Overview

Dataset statistics

Number of variables: 66
Number of observations: 158745
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 18
Duplicate rows (%): < 0.1%
Total size in memory: 75.6 MiB
Average record size in memory: 499.6 B

Variable types

Numeric: 14
Categorical: 28
Boolean: 24

Alerts

Dataset has 18 (< 0.1%) duplicate rows [Duplicates]
occupation_description has a high cardinality: 31889 distinct values [High cardinality]
risky_activities has a high cardinality: 112 distinct values [High cardinality]
chestpain_diagnosis has a high cardinality: 89 distinct values [High cardinality]
med_conditions has a high cardinality: 937 distinct values [High cardinality]
test_proc_type has a high cardinality: 226 distinct values [High cardinality]
disability_pmts_reason has a high cardinality: 88 distinct values [High cardinality]
mental_health_diagnosis has a high cardinality: 793 distinct values [High cardinality]
weight_loss_reason has a high cardinality: 166 distinct values [High cardinality]
surgery_type has a high cardinality: 137 distinct values [High cardinality]
travel_countries has a high cardinality: 1153 distinct values [High cardinality]
height is highly overall correlated with weight and 1 other field [High correlation]
weight is highly overall correlated with height and 1 other field [High correlation]
bmi_app_state is highly overall correlated with weight [High correlation]
diabetes_age is highly overall correlated with diabetes_hemoglobin [High correlation]
marijuana_monthly_count is highly overall correlated with marijuana [High correlation]
diabetes_hemoglobin is highly overall correlated with diabetes_age [High correlation]
gender is highly overall correlated with height [High correlation]
marijuana is highly overall correlated with marijuana_monthly_count [High correlation]
previous_declined is highly overall correlated with previous_decline_reason [High correlation]
previous_decline_reason is highly overall correlated with previous_declined [High correlation]
citizen is highly overall correlated with legal_resident [High correlation]
legal_resident is highly overall correlated with citizen [High correlation]
replacement_ins is highly imbalanced (87.1%) [Imbalance]
risky_activities is highly imbalanced (97.0%) [Imbalance]
valid_drivers_license_app_state is highly imbalanced (92.5%) [Imbalance]
hiv_pos is highly imbalanced (90.9%) [Imbalance]
covid is highly imbalanced (91.3%) [Imbalance]
previous_declined is highly imbalanced (81.2%) [Imbalance]
previous_decline_reason is highly imbalanced (94.3%) [Imbalance]
chestpain_diagnosis is highly imbalanced (98.0%) [Imbalance]
diabetes_complications is highly imbalanced (99.1%) [Imbalance]
diabetes_gestational is highly imbalanced (94.6%) [Imbalance]
diabetes_hospitalization is highly imbalanced (98.5%) [Imbalance]
family_history is highly imbalanced (96.1%) [Imbalance]
final_expense is highly imbalanced (97.2%) [Imbalance]
inpatient is highly imbalanced (98.4%) [Imbalance]
med_advice is highly imbalanced (75.8%) [Imbalance]
med_conditions is highly imbalanced (82.5%) [Imbalance]
seizure_diagnosis is highly imbalanced (98.7%) [Imbalance]
stroke_diagnosis is highly imbalanced (99.7%) [Imbalance]
stroke_diagnosis_multiselect is highly imbalanced (98.6%) [Imbalance]
test_proc_outstanding is highly imbalanced (92.2%) [Imbalance]
test_proc_type is highly imbalanced (98.5%) [Imbalance]
climbing_equipment is highly imbalanced (97.0%) [Imbalance]
criminal_history is highly imbalanced (81.8%) [Imbalance]
disability_audio_visual is highly imbalanced (98.6%) [Imbalance]
disability_pmts_reason is highly imbalanced (89.2%) [Imbalance]
expected_travel_90_days is highly imbalanced (99.0%) [Imbalance]
expected_travel is highly imbalanced (72.5%) [Imbalance]
expected_travel_multiselect is highly imbalanced (99.8%) [Imbalance]
illicit_drugs is highly imbalanced (77.0%) [Imbalance]
mental_health_diagnosis is highly imbalanced (88.1%) [Imbalance]
mental_health_hospitalized is highly imbalanced (96.6%) [Imbalance]
mental_health_missed_work is highly imbalanced (99.3%) [Imbalance]
pilot_student_private is highly imbalanced (98.3%) [Imbalance]
racing_100mph is highly imbalanced (98.0%) [Imbalance]
rx_increase is highly imbalanced (99.2%) [Imbalance]
scuba_130ft is highly imbalanced (98.9%) [Imbalance]
seizure_car_accident is highly imbalanced (99.9%) [Imbalance]
weight_loss_reason is highly imbalanced (94.3%) [Imbalance]
cancer_type is highly imbalanced (95.6%) [Imbalance]
citizen is highly imbalanced (73.8%) [Imbalance]
legal_resident is highly imbalanced (93.2%) [Imbalance]
surgery_type is highly imbalanced (94.0%) [Imbalance]
travel_countries is highly imbalanced (94.5%) [Imbalance]
current_ins_value is highly skewed (γ1 = 98.33006521) [Skewed]
diabetes_hemoglobin is highly skewed (γ1 = 258.1510392) [Skewed]
stroke_count is highly skewed (γ1 = 124.2600253) [Skewed]
dui_count is highly skewed (γ1 = 25.83698756) [Skewed]
weight_loss_amount is highly skewed (γ1 = 160.8835149) [Skewed]
skydive_count is highly skewed (γ1 = 265.0356407) [Skewed]
income_app_state has 5340 (3.4%) zeros [Zeros]
current_ins_value has 134699 (84.9%) zeros [Zeros]
alcohol_weekly has 107390 (67.6%) zeros [Zeros]
marijuana_monthly_count has 141142 (88.9%) zeros [Zeros]
stroke_count has 158446 (99.8%) zeros [Zeros]
dui_count has 154764 (97.5%) zeros [Zeros]
weight_loss_amount has 139638 (88.0%) zeros [Zeros]
skydive_count has 157778 (99.4%) zeros [Zeros]
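
The skewness alerts above report γ1 for count-like columns that are almost entirely zero. The following is a minimal sketch of how such a value can arise, assuming γ1 is the bias-adjusted sample skewness (the convention of pandas' `Series.skew`). The function name `sample_skewness` and the toy data are illustrative and not taken from the report.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (assumed to match the report's gamma_1)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant column: no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Toy zero-inflated column: 99 zeros and a single large value.
zero_inflated = [0] * 99 + [100]
symmetric = [1, 2, 3, 4, 5]

skew_zero_inflated = sample_skewness(zero_inflated)
skew_symmetric = sample_skewness(symmetric)
```

A single outlier among many zeros already yields a large positive skewness, which is consistent with the skewed columns also appearing in the zeros alerts.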

Reproduction

Analysis started: 2023-03-29 14:37:17.145272
Analysis finished: 2023-03-29 14:39:19.833886
Duration: 2 minutes and 2.69 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json

Variables

age
Real number (ℝ)

Distinct: 14871
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.214577
Minimum: 18.004476
Maximum: 59.998494
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 18.004476
5-th percentile: 25.536459
Q1: 34.108845
Median: 41.492981
Q3: 49.016749
95-th percentile: 54.700644
Maximum: 59.998494
Range: 41.994018
Interquartile range (IQR): 14.907904

Descriptive statistics

Standard deviation: 9.2330695
Coefficient of variation (CV): 0.22402437
Kurtosis: -0.83797718
Mean: 41.214577
Median Absolute Deviation (MAD): 7.4525829
Skewness: -0.19271567
Sum: 6542608
Variance: 85.249572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
49.72039125           31      < 0.1%
45.77232934           29      < 0.1%
47.41781145           29      < 0.1%
39.29170346           29      < 0.1%
50.36927521           29      < 0.1%
49.98596823           29      < 0.1%
36.78104273           29      < 0.1%
50.45141242           28      < 0.1%
54.80194665           28      < 0.1%
51.60954708           28      < 0.1%
Other values (14861)  158456  99.8%

Value        Count  Frequency (%)
18.00447648  1      < 0.1%
18.0126902   1      < 0.1%
18.01816601  1      < 0.1%
18.02637973  1      < 0.1%
18.09482741  2      < 0.1%
18.09756532  1      < 0.1%
18.10304113  1      < 0.1%
18.12220648  1      < 0.1%
18.13863392  1      < 0.1%
18.16053718  1      < 0.1%

Value        Count  Frequency (%)
59.99849415  4      < 0.1%
59.99575624  3      < 0.1%
59.99301834  3      < 0.1%
59.99028043  3      < 0.1%
59.98754252  5      < 0.1%
59.98480462  7      < 0.1%
59.98206671  3      < 0.1%
59.9793288   8      < 0.1%
59.9765909   1      < 0.1%
59.97385299  2      < 0.1%
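
Several of the statistics reported for age are derived from others: the range from the extremes, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the standard deviation from the variance. The sketch below recomputes them from the values printed in this section as a consistency check. The dictionary layout and the helper name `derived_stats` are illustrative assumptions, not part of the report.

```python
import math

# Summary values copied from the age section of the report.
age = {
    "min": 18.004476,
    "max": 59.998494,
    "q1": 34.108845,
    "q3": 49.016749,
    "mean": 41.214577,
    "std": 9.2330695,
    "variance": 85.249572,
}


def derived_stats(summary):
    """Recompute statistics that follow directly from other reported values."""
    return {
        "range": summary["max"] - summary["min"],
        "iqr": summary["q3"] - summary["q1"],
        "cv": summary["std"] / summary["mean"],
        "std_from_variance": math.sqrt(summary["variance"]),
    }


age_derived = derived_stats(age)
```

The recomputed values agree with the reported Range (41.994018), IQR (14.907904) and CV (0.22402437) up to rounding.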

gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.0206684
Min length: 4

Characters and Unicode

Total characters: 797006
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: male
3rd row: male
4th row: female
5th row: female

Common Values

Value   Count  Frequency (%)
female  81013  51.0%
male    77732  49.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
female  81013  51.0%
male    77732  49.0%

Most occurring characters

ValueCountFrequency (%)
e 239758
30.1%
m 158745
19.9%
a 158745
19.9%
l 158745
19.9%
f 81013
 
10.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 797006
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 239758
30.1%
m 158745
19.9%
a 158745
19.9%
l 158745
19.9%
f 81013
 
10.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 797006
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 239758
30.1%
m 158745
19.9%
a 158745
19.9%
l 158745
19.9%
f 81013
 
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 797006
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 239758
30.1%
m 158745
19.9%
a 158745
19.9%
l 158745
19.9%
f 81013
 
10.2%

height
Real number (ℝ)

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.398778
Minimum: 51
Maximum: 83
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 51
5-th percentile: 61
Q1: 64
Median: 67
Q3: 70
95-th percentile: 74
Maximum: 83
Range: 32
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.0879204
Coefficient of variation (CV): 0.060652738
Kurtosis: -0.57614842
Mean: 67.398778
Median Absolute Deviation (MAD): 3
Skewness: 0.088049693
Sum: 10699219
Variance: 16.711093
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
Value              Count  Frequency (%)
67                 14264  9.0%
66                 13630  8.6%
64                 12780  8.1%
69                 12539  7.9%
70                 12038  7.6%
68                 11896  7.5%
65                 11765  7.4%
71                 11503  7.2%
72                 10440  6.6%
63                 10234  6.4%
Other values (23)  37656  23.7%

Value  Count  Frequency (%)
51     2      < 0.1%
52     2      < 0.1%
53     2      < 0.1%
54     2      < 0.1%
55     2      < 0.1%
56     70     < 0.1%
57     212    0.1%
58     308    0.2%
59     1744   1.1%
60     3608   2.3%

Value  Count  Frequency (%)
83     10     < 0.1%
82     17     < 0.1%
81     31     < 0.1%
80     63     < 0.1%
79     132    0.1%
78     304    0.2%
77     688    0.4%
76     1728   1.1%
75     3108   2.0%
74     5284   3.3%

weight
Real number (ℝ)

Distinct: 236
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.89312
Minimum: 86
Maximum: 335
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 86
5-th percentile: 125
Q1: 152
Median: 180
Q3: 203
95-th percentile: 250
Maximum: 335
Range: 249
Interquartile range (IQR): 51

Descriptive statistics

Standard deviation: 37.826967
Coefficient of variation (CV): 0.20911225
Kurtosis: -0.019634682
Mean: 180.89312
Median Absolute Deviation (MAD): 25
Skewness: 0.44186586
Sum: 28715878
Variance: 1430.8795
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
180                 9030   5.7%
200                 8937   5.6%
160                 6731   4.2%
170                 6682   4.2%
190                 6555   4.1%
150                 6296   4.0%
175                 5736   3.6%
185                 5317   3.3%
165                 5285   3.3%
220                 5002   3.2%
Other values (226)  93174  58.7%

Value  Count  Frequency (%)
86     1      < 0.1%
90     4      < 0.1%
92     6      < 0.1%
93     3      < 0.1%
94     5      < 0.1%
95     53     < 0.1%
96     10     < 0.1%
97     11     < 0.1%
98     51     < 0.1%
99     14     < 0.1%

Value  Count  Frequency (%)
335    1      < 0.1%
330    5      < 0.1%
329    1      < 0.1%
328    1      < 0.1%
327    1      < 0.1%
325    13     < 0.1%
324    1      < 0.1%
321    1      < 0.1%
320    32     < 0.1%
319    1      < 0.1%

state
Categorical

Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 317490
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TX
2nd row: NM
3rd row: KS
4th row: OK
5th row: IN

Common Values

Value              Count  Frequency (%)
CA                 16857  10.6%
TX                 16714  10.5%
FL                 12806  8.1%
GA                 8171   5.1%
PA                 7151   4.5%
NC                 5892   3.7%
OH                 5499   3.5%
NJ                 5384   3.4%
IL                 5337   3.4%
VA                 4617   2.9%
Other values (40)  70317  44.3%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
ca                 16857  10.6%
tx                 16714  10.5%
fl                 12806  8.1%
ga                 8171   5.1%
pa                 7151   4.5%
nc                 5892   3.7%
oh                 5499   3.5%
nj                 5384   3.4%
il                 5337   3.4%
va                 4617   2.9%
Other values (40)  70317  44.3%

Most occurring characters

ValueCountFrequency (%)
A 55588
17.5%
C 31419
9.9%
N 25141
 
7.9%
T 24738
 
7.8%
L 24118
 
7.6%
M 19601
 
6.2%
I 18261
 
5.8%
X 16714
 
5.3%
O 14922
 
4.7%
F 12806
 
4.0%
Other values (14) 74182
23.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 317490
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 55588
17.5%
C 31419
9.9%
N 25141
 
7.9%
T 24738
 
7.8%
L 24118
 
7.6%
M 19601
 
6.2%
I 18261
 
5.8%
X 16714
 
5.3%
O 14922
 
4.7%
F 12806
 
4.0%
Other values (14) 74182
23.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 317490
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 55588
17.5%
C 31419
9.9%
N 25141
 
7.9%
T 24738
 
7.8%
L 24118
 
7.6%
M 19601
 
6.2%
I 18261
 
5.8%
X 16714
 
5.3%
O 14922
 
4.7%
F 12806
 
4.0%
Other values (14) 74182
23.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 317490
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 55588
17.5%
C 31419
9.9%
N 25141
 
7.9%
T 24738
 
7.8%
L 24118
 
7.6%
M 19601
 
6.2%
I 18261
 
5.8%
X 16714
 
5.3%
O 14922
 
4.7%
F 12806
 
4.0%
Other values (14) 74182
23.4%

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 12
Median length: 8
Mean length: 8.9565656
Min length: 8

Characters and Unicode

Total characters: 1421810
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: declined
2nd row: declined
3rd row: declined
4th row: declined
5th row: declined

Common Values

Value         Count  Frequency (%)
declined      94513  59.5%
select_nt     23118  14.6%
essential_nt  21681  13.7%
preferred_nt  10502  6.6%
elite_nt      8931   5.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count  Frequency (%)
declined      94513  59.5%
select_nt     23118  14.6%
essential_nt  21681  13.7%
preferred_nt  10502  6.6%
elite_nt      8931   5.6%

Most occurring characters

ValueCountFrequency (%)
e 327992
23.1%
d 199528
14.0%
n 180426
12.7%
l 148243
10.4%
i 125125
 
8.8%
t 117962
 
8.3%
c 117631
 
8.3%
s 66480
 
4.7%
_ 64232
 
4.5%
r 31506
 
2.2%
Other values (3) 42685
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1357578
95.5%
Connector Punctuation 64232
 
4.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 327992
24.2%
d 199528
14.7%
n 180426
13.3%
l 148243
10.9%
i 125125
 
9.2%
t 117962
 
8.7%
c 117631
 
8.7%
s 66480
 
4.9%
r 31506
 
2.3%
a 21681
 
1.6%
Other values (2) 21004
 
1.5%
Connector Punctuation
ValueCountFrequency (%)
_ 64232
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1357578
95.5%
Common 64232
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 327992
24.2%
d 199528
14.7%
n 180426
13.3%
l 148243
10.9%
i 125125
 
9.2%
t 117962
 
8.7%
c 117631
 
8.7%
s 66480
 
4.9%
r 31506
 
2.3%
a 21681
 
1.6%
Other values (2) 21004
 
1.5%
Common
ValueCountFrequency (%)
_ 64232
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1421810
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 327992
23.1%
d 199528
14.0%
n 180426
12.7%
l 148243
10.4%
i 125125
 
8.8%
t 117962
 
8.3%
c 117631
 
8.3%
s 66480
 
4.7%
_ 64232
 
4.5%
r 31506
 
2.2%
Other values (3) 42685
 
3.0%

bmi_app_state
Real number (ℝ)

Distinct: 2863
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.887274
Minimum: 18.507551
Maximum: 39.992889
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 18.507551
5-th percentile: 20.782048
Q1: 24.371674
Median: 27.405881
Q3: 30.948639
95-th percentile: 36.614583
Maximum: 39.992889
Range: 21.485338
Interquartile range (IQR): 6.5769648

Descriptive statistics

Standard deviation: 4.7357786
Coefficient of variation (CV): 0.16981862
Kurtosis: -0.42849006
Mean: 27.887274
Median Absolute Deviation (MAD): 3.2575717
Skewness: 0.3899919
Sum: 4426965.4
Variance: 22.427599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
28.69387755          1046    0.7%
27.36591696          1026    0.6%
28.18890622          946     0.6%
26.57844991          930     0.6%
29.04958678          927     0.6%
27.89129141          923     0.6%
27.12191358          889     0.6%
31.32100691          856     0.5%
30.40657439          815     0.5%
25.82185491          788     0.5%
Other values (2853)  149599  94.2%

Value        Count  Frequency (%)
18.50755102  3      < 0.1%
18.51491535  17     < 0.1%
18.53613281  13     < 0.1%
18.54770879  2      < 0.1%
18.54801038  8      < 0.1%
18.55138889  25     < 0.1%
18.55945822  94     0.1%
18.5785108   6      < 0.1%
18.57971847  3      < 0.1%
18.5978836   64     < 0.1%

Value        Count  Frequency (%)
39.99288889  13     < 0.1%
39.98999023  9      < 0.1%
39.98678541  3      < 0.1%
39.98464533  7      < 0.1%
39.97166448  1      < 0.1%
39.9342838   40     < 0.1%
39.93372781  167    0.1%
39.92567568  1      < 0.1%
39.92105263  1      < 0.1%
39.88454672  7      < 0.1%
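
The alerts flag bmi_app_state as highly correlated with weight, which is expected if BMI is computed from the height and weight columns. The sketch below assumes height is recorded in inches and weight in pounds (suggested by the ranges 51 to 83 and 86 to 335) and uses the conventional imperial factor 703. The function name `bmi_imperial` is illustrative, and the report itself does not state how the column was derived.

```python
def bmi_imperial(weight_lb, height_in):
    """Body mass index from pounds and inches (assumed units, conventional factor 703)."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return 703.0 * weight_lb / height_in ** 2


# Under these assumptions, two of the most common bmi_app_state values correspond
# to round weights at common heights.
bmi_200_at_70 = bmi_imperial(200, 70)  # compare with 28.69387755 in the table above
bmi_180_at_68 = bmi_imperial(180, 68)  # compare with 27.36591696 in the table above
```

If the assumption holds, bmi_app_state carries no information beyond height and weight, which matters when choosing which of the correlated columns to keep in a model.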

income_app_state
Real number (ℝ)

Distinct: 11989
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74073.337
Minimum: 0
Maximum: 2548000
Zeros: 5340
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 4800
Q1: 24960
Median: 44200
Q3: 78000
95-th percentile: 197600
Maximum: 2548000
Range: 2548000
Interquartile range (IQR): 53040

Descriptive statistics

Standard deviation: 140097.69
Coefficient of variation (CV): 1.8913376
Kurtosis: 84.638165
Mean: 74073.337
Median Absolute Deviation (MAD): 23400
Skewness: 8.1280328
Sum: 1.1758772 × 10^10
Variance: 1.9627363 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
0                     5340    3.4%
31200                 4429    2.8%
52000                 4021    2.5%
60000                 3622    2.3%
26000                 3039    1.9%
41600                 2625    1.7%
78000                 2439    1.5%
62400                 2184    1.4%
48000                 2130    1.3%
36000                 2107    1.3%
Other values (11979)  126809  79.9%

Value  Count  Frequency (%)
0      5340   3.4%
1      51     < 0.1%
2      4      < 0.1%
3      3      < 0.1%
4      2      < 0.1%
5      5      < 0.1%
6      1      < 0.1%
7      2      < 0.1%
8      1      < 0.1%
9      2      < 0.1%

Value    Count  Frequency (%)
2548000  1      < 0.1%
2470000  1      < 0.1%
2448000  2      < 0.1%
2400000  5      < 0.1%
2340000  2      < 0.1%
2323000  1      < 0.1%
2288000  3      < 0.1%
2280000  3      < 0.1%
2271360  1      < 0.1%
2236000  1      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count   Frequency (%)
False  134595  84.8%
True   24150   15.2%

current_ins_value
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 882
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 188493.35
Minimum: 0
Maximum: 1.8 × 10^9
Zeros: 134699
Zeros (%): 84.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 300000
Maximum: 1.8 × 10^9
Range: 1.8 × 10^9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 11012516
Coefficient of variation (CV): 58.423894
Kurtosis: 10748.56
Mean: 188493.35
Median Absolute Deviation (MAD): 0
Skewness: 98.330065
Sum: 2.9922377 × 10^10
Variance: 1.212755 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   134699  84.9%
100000              2763    1.7%
500000              2311    1.5%
250000              2070    1.3%
50000               1885    1.2%
1000000             1173    0.7%
300000              1145    0.7%
200000              1042    0.7%
150000              1000    0.6%
10000               953     0.6%
Other values (872)  9704    6.1%

Value  Count   Frequency (%)
0      134699  84.9%
1      5       < 0.1%
2      2       < 0.1%
3      1       < 0.1%
5      5       < 0.1%
6      1       < 0.1%
7      3       < 0.1%
10     4       < 0.1%
11     1       < 0.1%
13     1       < 0.1%

Value       Count  Frequency (%)
1800000000  1      < 0.1%
1000000000  12     < 0.1%
999999999   1      < 0.1%
980000000   1      < 0.1%
800000000   1      < 0.1%
625000000   1      < 0.1%
560000000   1      < 0.1%
300000000   2      < 0.1%
250000000   3      < 0.1%
200000000   4      < 0.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count   Frequency (%)
False  155916  98.2%
True   2829    1.8%

diabetes_age
Real number (ℝ)

Distinct: 61
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2238874
Minimum: -1
Maximum: 59
Zeros: 12
Zeros (%): < 0.1%
Negative: 149334
Negative (%): 94.1%
Memory size: 2.4 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: -1
Q3: -1
95-th percentile: 25
Maximum: 59
Range: 60
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 9.2631923
Coefficient of variation (CV): 7.568664
Kurtosis: 16.044981
Mean: 1.2238874
Median Absolute Deviation (MAD): 0
Skewness: 4.1555268
Sum: 194286
Variance: 85.806732
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
-1                 149334  94.1%
40                 762     0.5%
45                 604     0.4%
35                 553     0.3%
30                 440     0.3%
42                 358     0.2%
50                 345     0.2%
38                 341     0.2%
48                 322     0.2%
46                 283     0.2%
Other values (51)  5403    3.4%

Value  Count   Frequency (%)
-1     149334  94.1%
0      12      < 0.1%
1      15      < 0.1%
2      17      < 0.1%
3      18      < 0.1%
4      18      < 0.1%
5      28      < 0.1%
6      26      < 0.1%
7      35      < 0.1%
8      43      < 0.1%

Value  Count  Frequency (%)
59     5      < 0.1%
58     6      < 0.1%
57     12     < 0.1%
56     16     < 0.1%
55     26     < 0.1%
54     69     < 0.1%
53     115    0.1%
52     159    0.1%
51     138    0.1%
50     345    0.2%
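
In diabetes_age, 94.1% of rows hold -1, which reads like a placeholder for "not applicable" rather than a real age; that interpretation is an assumption, since the report does not document the encoding. Under it, the reported mean of 1.22 mixes placeholder and real values. The sketch below recovers the mean of the remaining rows from the reported Sum and counts alone. The function name `mean_excluding_sentinel` is illustrative.

```python
def mean_excluding_sentinel(total_sum, n_rows, sentinel, n_sentinel):
    """Mean of the rows that do not hold the sentinel, from aggregate figures only."""
    n_valid = n_rows - n_sentinel
    if n_valid <= 0:
        raise ValueError("no non-sentinel rows")
    # Remove the sentinel's contribution from the total before averaging.
    return (total_sum - sentinel * n_sentinel) / n_valid


# Figures copied from the diabetes_age section: Sum, observations, count of -1.
diabetes_age_mean = mean_excluding_sentinel(
    total_sum=194286, n_rows=158745, sentinel=-1, n_sentinel=149334
)
```

With these inputs the mean over the 9411 non-placeholder rows comes out at roughly 36.5, far from the headline mean. The same helper applies to zero-inflated columns by passing `sentinel=0`.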

alcohol_weekly
Real number (ℝ)

Distinct: 41
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9336861
Minimum: 0
Maximum: 48
Zeros: 107390
Zeros (%): 67.6%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 5
Maximum: 48
Range: 48
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.2318959
Coefficient of variation (CV): 2.3904136
Kurtosis: 65.762732
Mean: 0.9336861
Median Absolute Deviation (MAD): 0
Skewness: 6.1427137
Sum: 148218
Variance: 4.9813594
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
Value              Count   Frequency (%)
0                  107390  67.6%
1                  17910   11.3%
2                  14854   9.4%
3                  6405    4.0%
4                  3908    2.5%
5                  3171    2.0%
6                  1658    1.0%
10                 906     0.6%
7                  779     0.5%
8                  512     0.3%
Other values (31)  1252    0.8%

Value  Count   Frequency (%)
0      107390  67.6%
1      17910   11.3%
2      14854   9.4%
3      6405    4.0%
4      3908    2.5%
5      3171    2.0%
6      1658    1.0%
7      779     0.5%
8      512     0.3%
9      68      < 0.1%

Value  Count  Frequency (%)
48     2      < 0.1%
46     5      < 0.1%
45     5      < 0.1%
44     2      < 0.1%
42     2      < 0.1%
40     10     < 0.1%
36     6      < 0.1%
35     20     < 0.1%
34     12     < 0.1%
33     1      < 0.1%

marijuana_monthly_count
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 71
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3284702
Minimum: 0
Maximum: 99
Zeros: 141142
Zeros (%): 88.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 6
Maximum: 99
Range: 99
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 5.8922364
Coefficient of variation (CV): 4.4353546
Kurtosis: 57.162146
Mean: 1.3284702
Median Absolute Deviation (MAD): 0
Skewness: 6.5770954
Sum: 210888
Variance: 34.71845
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count   Frequency (%)
0                  141142  88.9%
30                 2659    1.7%
2                  2319    1.5%
1                  2270    1.4%
4                  1821    1.1%
10                 1539    1.0%
5                  1519    1.0%
3                  1346    0.8%
20                 905     0.6%
15                 734     0.5%
Other values (61)  2491    1.6%

Value  Count   Frequency (%)
0      141142  88.9%
1      2270    1.4%
2      2319    1.5%
3      1346    0.8%
4      1821    1.1%
5      1519    1.0%
6      392     0.2%
7      202     0.1%
8      349     0.2%
9      43      < 0.1%

Value  Count  Frequency (%)
99     6      < 0.1%
96     1      < 0.1%
93     1      < 0.1%
90     64     < 0.1%
89     1      < 0.1%
86     2      < 0.1%
85     2      < 0.1%
84     1      < 0.1%
80     14     < 0.1%
79     1      < 0.1%

marijuana
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count   Frequency (%)
False  140536  88.5%
True   18209   11.5%

occupation_description
Categorical

Distinct: 31889
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 339
Median length: 90
Mean length: 11.739986
Min length: 1

Characters and Unicode

Total characters: 1863664
Distinct characters: 109
Distinct categories: 17
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25115
Unique (%): 15.8%

Sample

1st row: Buy and sell online
2nd row: other
3rd row: other
4th row: other
5th row: other

Common Values

Value                 Count   Frequency (%)
other                 41186   25.9%
Sales                 1854    1.2%
Manager               1446    0.9%
Teacher               1438    0.9%
Self employed         1293    0.8%
Truck driver          1092    0.7%
Registered Nurse      1082    0.7%
Nurse                 1069    0.7%
Driver                973     0.6%
Construction          895     0.6%
Other values (31879)  106417  67.0%

Length

Histogram of lengths of the category
Value                Count   Frequency (%)
other                41188   16.0%
manager              10319   4.0%
driver               4625    1.8%
sales                4447    1.7%
nurse                3523    1.4%
assistant            3431    1.3%
service              2822    1.1%
owner                2726    1.1%
self                 2668    1.0%
business             2589    1.0%
Other values (9358)  178361  69.5%

Most occurring characters

ValueCountFrequency (%)
e 237476
12.7%
r 190377
 
10.2%
t 148356
 
8.0%
a 130854
 
7.0%
o 129131
 
6.9%
i 110871
 
5.9%
n 109809
 
5.9%
98357
 
5.3%
s 86855
 
4.7%
c 68164
 
3.7%
Other values (99) 553414
29.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1575167
84.5%
Uppercase Letter 184141
 
9.9%
Space Separator 98357
 
5.3%
Other Punctuation 3900
 
0.2%
Dash Punctuation 1068
 
0.1%
Decimal Number 491
 
< 0.1%
Open Punctuation 213
 
< 0.1%
Close Punctuation 209
 
< 0.1%
Final Punctuation 95
 
< 0.1%
Other Symbol 10
 
< 0.1%
Other values (7) 13
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 237476
15.1%
r 190377
12.1%
t 148356
9.4%
a 130854
8.3%
o 129131
8.2%
i 110871
 
7.0%
n 109809
 
7.0%
s 86855
 
5.5%
c 68164
 
4.3%
h 66022
 
4.2%
Other values (21) 297252
18.9%
Uppercase Letter
ValueCountFrequency (%)
S 21736
11.8%
C 21482
11.7%
M 16457
 
8.9%
A 15343
 
8.3%
P 12885
 
7.0%
T 11545
 
6.3%
D 11118
 
6.0%
R 10556
 
5.7%
E 9904
 
5.4%
O 7530
 
4.1%
Other values (17) 45585
24.8%
Other Punctuation
ValueCountFrequency (%)
/ 2134
54.7%
, 597
 
15.3%
. 593
 
15.2%
& 430
 
11.0%
' 97
 
2.5%
; 14
 
0.4%
@ 13
 
0.3%
: 11
 
0.3%
\ 7
 
0.2%
" 2
 
0.1%
Other values (2) 2
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 138
28.1%
2 87
17.7%
0 69
14.1%
3 52
 
10.6%
9 48
 
9.8%
5 28
 
5.7%
4 25
 
5.1%
7 17
 
3.5%
6 14
 
2.9%
8 13
 
2.6%
Other Symbol
ValueCountFrequency (%)
🍳 1
10.0%
1
10.0%
🧑 1
10.0%
💈 1
10.0%
1
10.0%
👮 1
10.0%
🏬 1
10.0%
📦 1
10.0%
👨 1
10.0%
® 1
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 1063
99.5%
4
 
0.4%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 212
99.5%
[ 1
 
0.5%
Close Punctuation
ValueCountFrequency (%)
) 208
99.5%
] 1
 
0.5%
Final Punctuation
ValueCountFrequency (%)
94
98.9%
1
 
1.1%
Math Symbol
ValueCountFrequency (%)
+ 3
60.0%
| 2
40.0%
Other Letter
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
98357
100.0%
Format
ValueCountFrequency (%)
2
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Nonspacing Mark
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1759308
94.4%
Common 104351
 
5.6%
Inherited 3
 
< 0.1%
Han 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 237476
13.5%
r 190377
 
10.8%
t 148356
 
8.4%
a 130854
 
7.4%
o 129131
 
7.3%
i 110871
 
6.3%
n 109809
 
6.2%
s 86855
 
4.9%
c 68164
 
3.9%
h 66022
 
3.8%
Other values (48) 481393
27.4%
Common
ValueCountFrequency (%)
98357
94.3%
/ 2134
 
2.0%
- 1063
 
1.0%
, 597
 
0.6%
. 593
 
0.6%
& 430
 
0.4%
( 212
 
0.2%
) 208
 
0.2%
1 138
 
0.1%
' 97
 
0.1%
Other values (37) 522
 
0.5%
Inherited
ValueCountFrequency (%)
2
66.7%
1
33.3%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1863506
> 99.9%
Punctuation 104
 
< 0.1%
None 49
 
< 0.1%
CJK 2
 
< 0.1%
Misc Symbols 1
 
< 0.1%
Letterlike Symbols 1
 
< 0.1%
VS 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 237476
12.7%
r 190377
 
10.2%
t 148356
 
8.0%
a 130854
 
7.0%
o 129131
 
6.9%
i 110871
 
5.9%
n 109809
 
5.9%
98357
 
5.3%
s 86855
 
4.7%
c 68164
 
3.7%
Other values (72) 553256
29.7%
Punctuation
ValueCountFrequency (%)
94
90.4%
4
 
3.8%
2
 
1.9%
1
 
1.0%
1
 
1.0%
1
 
1.0%
1
 
1.0%
None
ValueCountFrequency (%)
ó 29
59.2%
í 4
 
8.2%
á 4
 
8.2%
🍳 1
 
2.0%
🧑 1
 
2.0%
💈 1
 
2.0%
É 1
 
2.0%
👮 1
 
2.0%
1
 
2.0%
é 1
 
2.0%
Other values (5) 5
 
10.2%
Misc Symbols
ValueCountFrequency (%)
1
100.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%
VS
ValueCountFrequency (%)
1
100.0%

risky_activities
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 112
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Length

Max length: 34
Median length: 4
Mean length: 4.0857917
Min length: 4

Characters and Unicode

Total characters: 648599
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): < 0.1%

Sample

1st row: none
2nd row: none
3rd row: none
4th row: none
5th row: none

Common Values

Value                               Count   Frequency (%)
none                                155872  98.2%
scuba                               576     0.4%
climb                               496     0.3%
skydive                             417     0.3%
pilot                               354     0.2%
race                                275     0.2%
scuba, skydive                      73      < 0.1%
pilot, scuba, race, climb, skydive  61      < 0.1%
scuba, climb                        61      < 0.1%
climb, skydive                      60      < 0.1%
Other values (102)                  500     0.3%

Length

Histogram of lengths of the category
Value    Count   Frequency (%)
none     155872  97.4%
scuba    1098    0.7%
climb    990     0.6%
skydive  949     0.6%
race     609     0.4%
pilot    578     0.4%

Most occurring characters

ValueCountFrequency (%)
n 311744
48.1%
e 157430
24.3%
o 156450
24.1%
c 2697
 
0.4%
i 2517
 
0.4%
b 2088
 
0.3%
s 2047
 
0.3%
a 1707
 
0.3%
l 1568
 
0.2%
, 1351
 
0.2%
Other values (10) 9000
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 645897
99.6%
Other Punctuation 1351
 
0.2%
Space Separator 1351
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 311744
48.3%
e 157430
24.4%
o 156450
24.2%
c 2697
 
0.4%
i 2517
 
0.4%
b 2088
 
0.3%
s 2047
 
0.3%
a 1707
 
0.3%
l 1568
 
0.2%
u 1098
 
0.2%
Other values (8) 6551
 
1.0%
Other Punctuation
ValueCountFrequency (%)
, 1351
100.0%
Space Separator
ValueCountFrequency (%)
1351
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 645897
99.6%
Common 2702
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 311744
48.3%
e 157430
24.4%
o 156450
24.2%
c 2697
 
0.4%
i 2517
 
0.4%
b 2088
 
0.3%
s 2047
 
0.3%
a 1707
 
0.3%
l 1568
 
0.2%
u 1098
 
0.2%
Other values (8) 6551
 
1.0%
Common
ValueCountFrequency (%)
, 1351
50.0%
1351
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 648599
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 311744
48.1%
e 157430
24.3%
o 156450
24.1%
c 2697
 
0.4%
i 2517
 
0.4%
b 2088
 
0.3%
s 2047
 
0.3%
a 1707
 
0.3%
l 1568
 
0.2%
, 1351
 
0.2%
Other values (10) 9000
 
1.4%
Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.4 MiB

Value    Count     Frequency (%)
True     157303    99.1%
False    1442      0.9%

hiv_pos
Boolean

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.4 MiB

Value    Count     Frequency (%)
False    156908    98.8%
True     1837      1.2%

covid
Boolean

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.4 MiB

Value    Count     Frequency (%)
False    157003    98.9%
True     1742      1.1%

previous_declined
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.4 MiB

Value    Count     Frequency (%)
False    154169    97.1%
True     4576      2.9%

previous_decline_reason
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct21
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
153957 
medical
 
2714
other
 
1067
unknown
 
666
criminal
 
211
Other values (16)
 
130

Length

Max length33
Median length4
Mean length4.0856153
Min length4

Characters and Unicode

Total characters648571
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique5 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                 Count     Frequency (%)
none                  153957    97.0%
medical               2714      1.7%
other                 1067      0.7%
unknown               666       0.4%
criminal              211       0.1%
medical, other        36        < 0.1%
medical, unknown      19        < 0.1%
medical, criminal     18        < 0.1%
unknown, other        10        < 0.1%
other, medical        9         < 0.1%
Other values (11)     38        < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 153957
96.9%
medical 2814
 
1.8%
other 1147
 
0.7%
unknown 706
 
0.4%
criminal 261
 
0.2%

Most occurring characters

ValueCountFrequency (%)
n 310293
47.8%
e 157918
24.3%
o 155810
24.0%
i 3336
 
0.5%
c 3075
 
0.5%
a 3075
 
0.5%
l 3075
 
0.5%
m 3075
 
0.5%
d 2814
 
0.4%
r 1408
 
0.2%
Other values (7) 4692
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 648291
> 99.9%
Other Punctuation 140
 
< 0.1%
Space Separator 140
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 310293
47.9%
e 157918
24.4%
o 155810
24.0%
i 3336
 
0.5%
c 3075
 
0.5%
a 3075
 
0.5%
l 3075
 
0.5%
m 3075
 
0.5%
d 2814
 
0.4%
r 1408
 
0.2%
Other values (5) 4412
 
0.7%
Other Punctuation
ValueCountFrequency (%)
, 140
100.0%
Space Separator
ValueCountFrequency (%)
140
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 648291
> 99.9%
Common 280
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 310293
47.9%
e 157918
24.4%
o 155810
24.0%
i 3336
 
0.5%
c 3075
 
0.5%
a 3075
 
0.5%
l 3075
 
0.5%
m 3075
 
0.5%
d 2814
 
0.4%
r 1408
 
0.2%
Other values (5) 4412
 
0.7%
Common
ValueCountFrequency (%)
, 140
50.0%
140
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 648571
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 310293
47.8%
e 157918
24.3%
o 155810
24.0%
i 3336
 
0.5%
c 3075
 
0.5%
a 3075
 
0.5%
l 3075
 
0.5%
m 3075
 
0.5%
d 2814
 
0.4%
r 1408
 
0.2%
Other values (7) 4692
 
0.7%

chestpain_diagnosis
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct89
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
156949 
other
 
484
cardiac_chest_pain
 
272
heartburn
 
211
muscle_strain
 
188
Other values (84)
 
641

Length

Max length72
Median length4
Mean length4.1153926
Min length4

Characters and Unicode

Total characters653298
Distinct characters20
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique36 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                            Count     Frequency (%)
none                             156949    98.9%
other                            484       0.3%
cardiac_chest_pain               272       0.2%
heartburn                        211       0.1%
muscle_strain                    188       0.1%
angina                           155       0.1%
heartburn, indigestion           81        0.1%
indigestion                      74        < 0.1%
heartburn, muscle_strain         36        < 0.1%
heartburn, cardiac_chest_pain    20        < 0.1%
Other values (79)                275       0.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 156949
98.5%
other 584
 
0.4%
heartburn 498
 
0.3%
cardiac_chest_pain 422
 
0.3%
muscle_strain 361
 
0.2%
indigestion 293
 
0.2%
angina 233
 
0.1%

Most occurring characters

ValueCountFrequency (%)
n 316231
48.4%
e 159107
24.4%
o 157826
24.2%
a 2591
 
0.4%
r 2363
 
0.4%
i 2317
 
0.4%
t 2158
 
0.3%
c 1627
 
0.2%
h 1504
 
0.2%
s 1437
 
0.2%
Other values (10) 6137
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 650903
99.6%
Connector Punctuation 1205
 
0.2%
Other Punctuation 595
 
0.1%
Space Separator 595
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 316231
48.6%
e 159107
24.4%
o 157826
24.2%
a 2591
 
0.4%
r 2363
 
0.4%
i 2317
 
0.4%
t 2158
 
0.3%
c 1627
 
0.2%
h 1504
 
0.2%
s 1437
 
0.2%
Other values (7) 3742
 
0.6%
Connector Punctuation
ValueCountFrequency (%)
_ 1205
100.0%
Other Punctuation
ValueCountFrequency (%)
, 595
100.0%
Space Separator
ValueCountFrequency (%)
595
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 650903
99.6%
Common 2395
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 316231
48.6%
e 159107
24.4%
o 157826
24.2%
a 2591
 
0.4%
r 2363
 
0.4%
i 2317
 
0.4%
t 2158
 
0.3%
c 1627
 
0.2%
h 1504
 
0.2%
s 1437
 
0.2%
Other values (7) 3742
 
0.6%
Common
ValueCountFrequency (%)
_ 1205
50.3%
, 595
24.8%
595
24.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 653298
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 316231
48.4%
e 159107
24.4%
o 157826
24.2%
a 2591
 
0.4%
r 2363
 
0.4%
i 2317
 
0.4%
t 2158
 
0.3%
c 1627
 
0.2%
h 1504
 
0.2%
s 1437
 
0.2%
Other values (10) 6137
 
0.9%
Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
158325 
neuropathy
 
259
kidney_disease
 
99
amputation
 
30
neuropathy, kidney_disease
 
15
Other values (6)
 
17

Length

Max length38
Median length4
Mean length4.02167
Min length4

Characters and Unicode

Total characters638420
Distinct characters18
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                                     Count     Frequency (%)
none                                      158325    99.7%
neuropathy                                259       0.2%
kidney_disease                            99        0.1%
amputation                                30        < 0.1%
neuropathy, kidney_disease                15        < 0.1%
amputation, neuropathy                    8         < 0.1%
kidney_disease, neuropathy                4         < 0.1%
amputation, neuropathy, kidney_disease    2         < 0.1%
amputation, kidney_disease, neuropathy    1         < 0.1%
neuropathy, amputation                    1         < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 158325
99.7%
neuropathy 291
 
0.2%
kidney_disease 122
 
0.1%
amputation 43
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 317106
49.7%
e 158982
24.9%
o 158659
24.9%
a 499
 
0.1%
y 413
 
0.1%
t 377
 
0.1%
p 334
 
0.1%
u 334
 
0.1%
r 291
 
< 0.1%
h 291
 
< 0.1%
Other values (8) 1134
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 638226
> 99.9%
Connector Punctuation 122
 
< 0.1%
Other Punctuation 36
 
< 0.1%
Space Separator 36
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 317106
49.7%
e 158982
24.9%
o 158659
24.9%
a 499
 
0.1%
y 413
 
0.1%
t 377
 
0.1%
p 334
 
0.1%
u 334
 
0.1%
r 291
 
< 0.1%
h 291
 
< 0.1%
Other values (5) 940
 
0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 122
100.0%
Other Punctuation
ValueCountFrequency (%)
, 36
100.0%
Space Separator
ValueCountFrequency (%)
36
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 638226
> 99.9%
Common 194
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 317106
49.7%
e 158982
24.9%
o 158659
24.9%
a 499
 
0.1%
y 413
 
0.1%
t 377
 
0.1%
p 334
 
0.1%
u 334
 
0.1%
r 291
 
< 0.1%
h 291
 
< 0.1%
Other values (5) 940
 
0.1%
Common
ValueCountFrequency (%)
_ 122
62.9%
, 36
 
18.6%
36
 
18.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 638420
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 317106
49.7%
e 158982
24.9%
o 158659
24.9%
a 499
 
0.1%
y 413
 
0.1%
t 377
 
0.1%
p 334
 
0.1%
u 334
 
0.1%
r 291
 
< 0.1%
h 291
 
< 0.1%
Other values (8) 1134
 
0.2%
Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.4 MiB

Value    Count     Frequency (%)
False    157765    99.4%
True     980       0.6%

diabetes_hemoglobin
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct        296
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1.4113767
Minimum         -1
Maximum         123456
Zeros           57
Zeros (%)       < 0.1%
Negative        154333
Negative (%)    97.2%
Memory size     2.4 MiB

Quantile statistics

Minimum                      -1
5-th percentile              -1
Q1                           -1
Median                       -1
Q3                           -1
95-th percentile             -1
Maximum                      123456
Range                        123457
Interquartile range (IQR)    0

Descriptive statistics

Standard deviation                 421.6662
Coefficient of variation (CV)      298.76233
Kurtosis                           69169.909
Mean                               1.4113767
Median Absolute Deviation (MAD)    0
Skewness                           258.15104
Sum                                224049
Variance                           177802.39
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
-1                    154333    97.2%
7                     476       0.3%
6                     379       0.2%
8                     245       0.2%
6.5                   162       0.1%
6.1                   132       0.1%
6.7                   128       0.1%
7.1                   124       0.1%
9                     118       0.1%
7.2                   109       0.1%
Other values (286)    2539      1.6%

Value    Count     Frequency (%)
-1       154333    97.2%
0        57        < 0.1%
0.3      1         < 0.1%
1        9         < 0.1%
1.3      1         < 0.1%
1.4      1         < 0.1%
1.5      1         < 0.1%
2        10        < 0.1%
2.2      1         < 0.1%
2.3      1         < 0.1%

Value     Count    Frequency (%)
123456    1        < 0.1%
102021    1        < 0.1%
50522     1        < 0.1%
2019      1        < 0.1%
1750      1        < 0.1%
900       1        < 0.1%
801       1        < 0.1%
710       1        < 0.1%
700       1        < 0.1%
685       1        < 0.1%
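
The summary statistics for diabetes_hemoglobin are dominated by two effects visible in the value tables: 97.2% of rows hold exactly -1, and a handful of very large values (up to 123456) pull the mean and variance upward. A minimal cleaning sketch follows, under two loudly stated assumptions that the report itself does not confirm: that -1 is a placeholder meaning "no value reported", and that readings outside a hypothetical range of 3.0 to 20.0 should be treated as data-entry errors. Both the range and the helper name `clean_reading` are illustrative only.

```python
import math

SENTINEL = -1            # assumption: -1 marks "no value reported"
PLAUSIBLE = (3.0, 20.0)  # hypothetical range, not taken from the report


def clean_reading(value):
    """Return (cleaned_value, status) for one raw reading."""
    if value == SENTINEL:
        return (math.nan, "not_reported")
    low, high = PLAUSIBLE
    if low <= value <= high:
        return (float(value), "ok")
    # Zeros, tiny values and values such as 685 or 123456 end up here.
    return (math.nan, "implausible")


# Raw values that all occur in the frequency tables for this variable.
raw = [-1, 7, 6.5, 0, 685, 123456]
cleaned = [clean_reading(v) for v in raw]
statuses = [status for _, status in cleaned]
print(statuses)
```

Keeping the status next to the cleaned value preserves the distinction between "not reported" and "reported but implausible", which a single NaN would erase.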
Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.4 MiB

Value    Count     Frequency (%)
False    158534    99.9%
True     211       0.1%
Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
fulltime
104302 
unemployed
15238 
parttime
13262 
stay_at_home
 
8986
other
 
8899
Other values (3)
 
8058

Length

Max length16
Median length8
Mean length8.4345838
Min length5

Characters and Unicode

Total characters1338948
Distinct characters17
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowparttime
2nd rowunemployed
3rd rowstay_at_home
4th rowother
5th rowfulltime_student

Common Values

Value               Count     Frequency (%)
fulltime            104302    65.7%
unemployed          15238     9.6%
parttime            13262     8.4%
stay_at_home        8986      5.7%
other               8899      5.6%
retired             3911      2.5%
fulltime_student    3078      1.9%
parttime_student    1069      0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
fulltime 104302
65.7%
unemployed 15238
 
9.6%
parttime 13262
 
8.4%
stay_at_home 8986
 
5.7%
other 8899
 
5.6%
retired 3911
 
2.5%
fulltime_student 3078
 
1.9%
parttime_student 1069
 
0.7%

Most occurring characters

ValueCountFrequency (%)
l 229998
17.2%
e 182041
13.6%
t 175118
13.1%
m 145935
10.9%
u 126765
9.5%
i 125622
9.4%
f 107380
8.0%
o 33123
 
2.5%
a 32303
 
2.4%
r 31052
 
2.3%
Other values (7) 149611
11.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1316829
98.3%
Connector Punctuation 22119
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 229998
17.5%
e 182041
13.8%
t 175118
13.3%
m 145935
11.1%
u 126765
9.6%
i 125622
9.5%
f 107380
8.2%
o 33123
 
2.5%
a 32303
 
2.5%
r 31052
 
2.4%
Other values (6) 127492
9.7%
Connector Punctuation
ValueCountFrequency (%)
_ 22119
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1316829
98.3%
Common 22119
 
1.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 229998
17.5%
e 182041
13.8%
t 175118
13.3%
m 145935
11.1%
u 126765
9.6%
i 125622
9.5%
f 107380
8.2%
o 33123
 
2.5%
a 32303
 
2.5%
r 31052
 
2.4%
Other values (6) 127492
9.7%
Common
ValueCountFrequency (%)
_ 22119
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1338948
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 229998
17.2%
e 182041
13.6%
t 175118
13.1%
m 145935
10.9%
u 126765
9.5%
i 125622
9.4%
f 107380
8.0%
o 33123
 
2.5%
a 32303
 
2.4%
r 31052
 
2.3%
Other values (7) 149611
11.2%

family_history
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
no
157737 
yes
 
638
unknown
 
370

Length

Max length7
Median length2
Mean length2.0156729
Min length2

Characters and Unicode

Total characters319978
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowno
2nd rowno
3rd rowno
4th rowyes
5th rowno

Common Values

Value      Count     Frequency (%)
no         157737    99.4%
yes        638       0.4%
unknown    370       0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
no 157737
99.4%
yes 638
 
0.4%
unknown 370
 
0.2%

Most occurring characters

ValueCountFrequency (%)
n 158847
49.6%
o 158107
49.4%
y 638
 
0.2%
e 638
 
0.2%
s 638
 
0.2%
u 370
 
0.1%
k 370
 
0.1%
w 370
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 319978
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 158847
49.6%
o 158107
49.4%
y 638
 
0.2%
e 638
 
0.2%
s 638
 
0.2%
u 370
 
0.1%
k 370
 
0.1%
w 370
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 319978
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 158847
49.6%
o 158107
49.4%
y 638
 
0.2%
e 638
 
0.2%
s 638
 
0.2%
u 370
 
0.1%
k 370
 
0.1%
w 370
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 319978
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 158847
49.6%
o 158107
49.4%
y 638
 
0.2%
e 638
 
0.2%
s 638
 
0.2%
u 370
 
0.1%
k 370
 
0.1%
w 370
 
0.1%

final_expense
Categorical

Distinct34
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
156562 
assistance
 
766
oxygen
 
457
wheelchair
 
302
memory
 
299
Other values (29)
 
359

Length

Max length38
Median length4
Mean length4.0903714
Min length4

Characters and Unicode

Total characters649326
Distinct characters18
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                             Count     Frequency (%)
none                              156562    98.6%
assistance                        766       0.5%
oxygen                            457       0.3%
wheelchair                        302       0.2%
memory                            299       0.2%
wheelchair, assistance            118       0.1%
oxygen, assistance                62        < 0.1%
memory, assistance                51        < 0.1%
wheelchair, oxygen, assistance    25        < 0.1%
assistance, wheelchair            22        < 0.1%
Other values (24)                 81        0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 156562
98.4%
assistance 1099
 
0.7%
oxygen 604
 
0.4%
wheelchair 516
 
0.3%
memory 404
 
0.3%

Most occurring characters

ValueCountFrequency (%)
n 314827
48.5%
e 159701
24.6%
o 157570
24.3%
s 3297
 
0.5%
a 2714
 
0.4%
i 1615
 
0.2%
c 1615
 
0.2%
t 1099
 
0.2%
h 1032
 
0.2%
y 1008
 
0.2%
Other values (8) 4848
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 648446
99.9%
Other Punctuation 440
 
0.1%
Space Separator 440
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 314827
48.6%
e 159701
24.6%
o 157570
24.3%
s 3297
 
0.5%
a 2714
 
0.4%
i 1615
 
0.2%
c 1615
 
0.2%
t 1099
 
0.2%
h 1032
 
0.2%
y 1008
 
0.2%
Other values (6) 3968
 
0.6%
Other Punctuation
ValueCountFrequency (%)
, 440
100.0%
Space Separator
ValueCountFrequency (%)
440
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 648446
99.9%
Common 880
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 314827
48.6%
e 159701
24.6%
o 157570
24.3%
s 3297
 
0.5%
a 2714
 
0.4%
i 1615
 
0.2%
c 1615
 
0.2%
t 1099
 
0.2%
h 1032
 
0.2%
y 1008
 
0.2%
Other values (6) 3968
 
0.6%
Common
ValueCountFrequency (%)
, 440
50.0%
440
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 649326
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 314827
48.5%
e 159701
24.6%
o 157570
24.3%
s 3297
 
0.5%
a 2714
 
0.4%
i 1615
 
0.2%
c 1615
 
0.2%
t 1099
 
0.2%
h 1032
 
0.2%
y 1008
 
0.2%
Other values (8) 4848
 
0.7%

inpatient
Categorical

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
157924 
hospital
 
519
longterm_care
 
157
hospice
 
106
hospital, longterm_care
 
21
Other values (5)
 
18

Length

Max length32
Median length4
Mean length4.029223
Min length4

Characters and Unicode

Total characters639619
Distinct characters17
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                               Count     Frequency (%)
none                                157924    99.5%
hospital                            519       0.3%
longterm_care                       157       0.1%
hospice                             106       0.1%
hospital, longterm_care             21        < 0.1%
hospital, longterm_care, hospice    10        < 0.1%
longterm_care, hospital             4         < 0.1%
longterm_care, hospice              2         < 0.1%
longterm_care, hospital, hospice    1         < 0.1%
hospital, hospice                   1         < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 157924
99.5%
hospital 556
 
0.4%
longterm_care 195
 
0.1%
hospice 120
 
0.1%

Most occurring characters

ValueCountFrequency (%)
n 316043
49.4%
o 158795
24.8%
e 158434
24.8%
t 751
 
0.1%
l 751
 
0.1%
a 751
 
0.1%
i 676
 
0.1%
p 676
 
0.1%
s 676
 
0.1%
h 676
 
0.1%
Other values (7) 1390
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 639324
> 99.9%
Connector Punctuation 195
 
< 0.1%
Other Punctuation 50
 
< 0.1%
Space Separator 50
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 316043
49.4%
o 158795
24.8%
e 158434
24.8%
t 751
 
0.1%
l 751
 
0.1%
a 751
 
0.1%
i 676
 
0.1%
p 676
 
0.1%
s 676
 
0.1%
h 676
 
0.1%
Other values (4) 1095
 
0.2%
Connector Punctuation
ValueCountFrequency (%)
_ 195
100.0%
Other Punctuation
ValueCountFrequency (%)
, 50
100.0%
Space Separator
ValueCountFrequency (%)
50
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 639324
> 99.9%
Common 295
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 316043
49.4%
o 158795
24.8%
e 158434
24.8%
t 751
 
0.1%
l 751
 
0.1%
a 751
 
0.1%
i 676
 
0.1%
p 676
 
0.1%
s 676
 
0.1%
h 676
 
0.1%
Other values (4) 1095
 
0.2%
Common
ValueCountFrequency (%)
_ 195
66.1%
, 50
 
16.9%
50
 
16.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 639619
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 316043
49.4%
o 158795
24.8%
e 158434
24.8%
t 751
 
0.1%
l 751
 
0.1%
a 751
 
0.1%
i 676
 
0.1%
p 676
 
0.1%
s 676
 
0.1%
h 676
 
0.1%
Other values (7) 1390
 
0.2%

med_advice
Categorical

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
144066 
surgery
 
7108
test_procedure
 
6808
surgery, test_procedure
 
672
test_procedure, surgery
 
91

Length

Max length23
Median length4
Mean length4.6545151
Min length4

Characters and Unicode

Total characters738881
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rownone
2nd rownone
3rd rowtest_procedure
4th rownone
5th rowtest_procedure

Common Values

Value                      Count     Frequency (%)
none                       144066    90.8%
surgery                    7108      4.5%
test_procedure             6808      4.3%
surgery, test_procedure    672       0.4%
test_procedure, surgery    91        0.1%
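
Two of the five med_advice values, "surgery, test_procedure" and "test_procedure, surgery", contain the same items in a different order, so the profiler counts them as separate categories. A minimal sketch of one way to merge them by sorting the items within each value; the helper name `canonical` is hypothetical, and alphabetical ordering is an arbitrary choice made for illustration rather than anything the report prescribes.

```python
from collections import Counter


def canonical(value, sep=","):
    """Sort and de-duplicate the items of a comma-separated value."""
    parts = sorted({p.strip() for p in value.split(sep) if p.strip()})
    return ", ".join(parts)


# Value counts for med_advice as listed in the report.
med_advice = {
    "none": 144066,
    "surgery": 7108,
    "test_procedure": 6808,
    "surgery, test_procedure": 672,
    "test_procedure, surgery": 91,
}

merged = Counter()
for value, n in med_advice.items():
    merged[canonical(value)] += n

for value, n in merged.most_common():
    print(value, n)
```

After merging, the combined category holds 672 + 91 = 763 rows, which matches the 763 commas reported for this variable, and the number of distinct values drops from 5 to 4. The same normalisation would apply to the other multi-valued columns in this report that list reordered combinations separately.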

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 144066
90.3%
surgery 7871
 
4.9%
test_procedure 7571
 
4.7%

Most occurring characters

ValueCountFrequency (%)
n 288132
39.0%
e 174650
23.6%
o 151637
20.5%
r 30884
 
4.2%
s 15442
 
2.1%
u 15442
 
2.1%
t 15142
 
2.0%
g 7871
 
1.1%
y 7871
 
1.1%
_ 7571
 
1.0%
Other values (5) 24239
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 729784
98.8%
Connector Punctuation 7571
 
1.0%
Other Punctuation 763
 
0.1%
Space Separator 763
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 288132
39.5%
e 174650
23.9%
o 151637
20.8%
r 30884
 
4.2%
s 15442
 
2.1%
u 15442
 
2.1%
t 15142
 
2.1%
g 7871
 
1.1%
y 7871
 
1.1%
p 7571
 
1.0%
Other values (2) 15142
 
2.1%
Connector Punctuation
ValueCountFrequency (%)
_ 7571
100.0%
Other Punctuation
ValueCountFrequency (%)
, 763
100.0%
Space Separator
ValueCountFrequency (%)
763
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 729784
98.8%
Common 9097
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 288132
39.5%
e 174650
23.9%
o 151637
20.8%
r 30884
 
4.2%
s 15442
 
2.1%
u 15442
 
2.1%
t 15142
 
2.1%
g 7871
 
1.1%
y 7871
 
1.1%
p 7571
 
1.0%
Other values (2) 15142
 
2.1%
Common
ValueCountFrequency (%)
_ 7571
83.2%
, 763
 
8.4%
763
 
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 738881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 288132
39.0%
e 174650
23.6%
o 151637
20.5%
r 30884
 
4.2%
s 15442
 
2.1%
u 15442
 
2.1%
t 15142
 
2.0%
g 7871
 
1.1%
y 7871
 
1.1%
_ 7571
 
1.0%
Other values (5) 24239
 
3.3%

med_conditions
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct937
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
121947 
depression
13703 
diabetes
 
6935
cancer
 
2294
heart_disease
 
1771
Other values (932)
 
12095

Length

Max length282
Median length4
Mean length6.0990393
Min length2

Characters and Unicode

Total characters968192
Distinct characters24
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique567 ?
Unique (%)0.4%

Sample

1st rownone
2nd rownone
3rd rowliver_cirrhosis
4th rownone
5th rowdepression

Common Values

Value                   Count     Frequency (%)
none                    121947    76.8%
depression              13703     8.6%
diabetes                6935      4.4%
cancer                  2294      1.4%
heart_disease           1771      1.1%
depression, diabetes    835       0.5%
chest_pain              799       0.5%
seizure_disorder        691       0.4%
alcohol_abuse           564       0.4%
stroke                  542       0.3%
Other values (927)      8664      5.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 121947
72.7%
depression 17810
 
10.6%
diabetes 9782
 
5.8%
heart_disease 3397
 
2.0%
cancer 3054
 
1.8%
chest_pain 1796
 
1.1%
seizure_disorder 1385
 
0.8%
alcohol_abuse 1383
 
0.8%
chronic_kidney_disease 1272
 
0.8%
stroke 1270
 
0.8%
Other values (12) 4567
 
2.7%

Most occurring characters

ValueCountFrequency (%)
n 271369
28.0%
e 206657
21.3%
o 148985
15.4%
s 67588
 
7.0%
i 43582
 
4.5%
d 37651
 
3.9%
r 35399
 
3.7%
a 30283
 
3.1%
p 22564
 
2.3%
t 18361
 
1.9%
Other values (14) 85753
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 937717
96.9%
Connector Punctuation 12639
 
1.3%
Other Punctuation 8918
 
0.9%
Space Separator 8918
 
0.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 271369
28.9%
e 206657
22.0%
o 148985
15.9%
s 67588
 
7.2%
i 43582
 
4.6%
d 37651
 
4.0%
r 35399
 
3.8%
a 30283
 
3.2%
p 22564
 
2.4%
t 18361
 
2.0%
Other values (11) 55278
 
5.9%
Connector Punctuation
ValueCountFrequency (%)
_ 12639
100.0%
Other Punctuation
ValueCountFrequency (%)
, 8918
100.0%
Space Separator
ValueCountFrequency (%)
8918
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 937717
96.9%
Common 30475
 
3.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 271369
28.9%
e 206657
22.0%
o 148985
15.9%
s 67588
 
7.2%
i 43582
 
4.6%
d 37651
 
4.0%
r 35399
 
3.8%
a 30283
 
3.2%
p 22564
 
2.4%
t 18361
 
2.0%
Other values (11) 55278
 
5.9%
Common
ValueCountFrequency (%)
_ 12639
41.5%
, 8918
29.3%
8918
29.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 968192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 271369
28.0%
e 206657
21.3%
o 148985
15.4%
s 67588
 
7.0%
i 43582
 
4.5%
d 37651
 
3.9%
r 35399
 
3.7%
a 30283
 
3.1%
p 22564
 
2.3%
t 18361
 
1.9%
Other values (14) 85753
 
8.9%
Distinct30
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
157882 
other
 
274
grand_mal
 
215
unknown
 
175
petit_mal
 
70
Other values (25)
 
129

Length

Max length33
Median length4
Mean length4.0240071
Min length4

Characters and Unicode

Total characters638791
Distinct characters19
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                          Count     Frequency (%)
none                           157882    99.5%
other                          274       0.2%
grand_mal                      215       0.1%
unknown                        175       0.1%
petit_mal                      70        < 0.1%
grand_mal, petit_mal           27        < 0.1%
drug                           19        < 0.1%
grand_mal, other               16        < 0.1%
other, unknown                 15        < 0.1%
grand_mal, petit_mal, other    6         < 0.1%
Other values (20)              46        < 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 157882
99.4%
other 338
 
0.2%
grand_mal 291
 
0.2%
unknown 207
 
0.1%
petit_mal 121
 
0.1%
drug 38
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 316676
49.6%
o 158427
24.8%
e 158341
24.8%
a 703
 
0.1%
r 667
 
0.1%
t 580
 
0.1%
l 412
 
0.1%
m 412
 
0.1%
_ 412
 
0.1%
h 338
 
0.1%
Other values (9) 1823
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 638115
99.9%
Connector Punctuation 412
 
0.1%
Other Punctuation 132
 
< 0.1%
Space Separator 132
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 316676
49.6%
o 158427
24.8%
e 158341
24.8%
a 703
 
0.1%
r 667
 
0.1%
t 580
 
0.1%
l 412
 
0.1%
m 412
 
0.1%
h 338
 
0.1%
d 329
 
0.1%
Other values (6) 1230
 
0.2%
Connector Punctuation
ValueCountFrequency (%)
_ 412
100.0%
Other Punctuation
ValueCountFrequency (%)
, 132
100.0%
Space Separator
ValueCountFrequency (%)
132
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 638115
99.9%
Common 676
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 316676
49.6%
o 158427
24.8%
e 158341
24.8%
a 703
 
0.1%
r 667
 
0.1%
t 580
 
0.1%
l 412
 
0.1%
m 412
 
0.1%
h 338
 
0.1%
d 329
 
0.1%
Other values (6) 1230
 
0.2%
Common
ValueCountFrequency (%)
_ 412
60.9%
, 132
 
19.5%
132
 
19.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 638791
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 316676
49.6%
o 158427
24.8%
e 158341
24.8%
a 703
 
0.1%
r 667
 
0.1%
t 580
 
0.1%
l 412
 
0.1%
m 412
 
0.1%
_ 412
 
0.1%
h 338
 
0.1%
Other values (9) 1823
 
0.3%

stroke_count
Real number (ℝ)

SKEWED  ZEROS 

Distinct        12
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            0.0030300167
Minimum         0
Maximum         27
Zeros           158446
Zeros (%)       99.8%
Negative        0
Negative (%)    0.0%
Memory size     2.4 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0
Median                       0
Q3                           0
95-th percentile             0
Maximum                      27
Range                        27
Interquartile range (IQR)    0

Descriptive statistics

Standard deviation                 0.11149993
Coefficient of variation (CV)      36.798454
Kurtosis                           24843.017
Mean                               0.0030300167
Median Absolute Deviation (MAD)    0
Skewness                           124.26003
Sum                                481
Variance                           0.012432234
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=12)
Value               Count     Frequency (%)
0                   158446    99.8%
1                   218       0.1%
2                   50        < 0.1%
3                   15        < 0.1%
4                   7         < 0.1%
5                   3         < 0.1%
10                  1         < 0.1%
9                   1         < 0.1%
6                   1         < 0.1%
8                   1         < 0.1%
Other values (2)    2         < 0.1%

Value    Count     Frequency (%)
0        158446    99.8%
1        218       0.1%
2        50        < 0.1%
3        15        < 0.1%
4        7         < 0.1%
5        3         < 0.1%
6        1         < 0.1%
8        1         < 0.1%
9        1         < 0.1%
10       1         < 0.1%

Value    Count    Frequency (%)
27       1        < 0.1%
15       1        < 0.1%
10       1        < 0.1%
9        1        < 0.1%
8        1        < 0.1%
6        1        < 0.1%
5        3        < 0.1%
4        7        < 0.1%
3        15       < 0.1%
2        50       < 0.1%
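
Taken together, the frequency tables for stroke_count list all 12 distinct values, so the summary statistics can be cross-checked by hand. A minimal sketch, assuming the reported variance is the sample variance (divisor n - 1); that assumption is not stated in the report, but it is the convention that reproduces the reported figure.

```python
import math

# Full distribution of stroke_count, assembled from the frequency tables.
counts = {0: 158446, 1: 218, 2: 50, 3: 15, 4: 7, 5: 3,
          6: 1, 8: 1, 9: 1, 10: 1, 15: 1, 27: 1}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # the "Sum" statistic
mean = total / n
# Sample variance (divisor n - 1), assumed to match the report's convention.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                   # coefficient of variation
zero_share = counts[0] / n

print(n, total, mean, variance, std, cv, zero_share)
```

With 99.8% of rows at zero, the mean sits close to zero, which is why a handful of large counts produce a coefficient of variation near 37.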

stroke_diagnosis
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value               Count
none                158665
mini-stroke or tia  41
stroke              25
unknown             14

Length

Max length: 18
Median length: 4
Mean length: 4.0041954
Min length: 4

Characters and Unicode

Total characters: 635646
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: none
2nd row: none
3rd row: none
4th row: none
5th row: none

Common Values

Value               Count    Frequency (%)
none                158665   99.9%
mini-stroke or tia  41       < 0.1%
stroke              25       < 0.1%
unknown             14       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 158665
99.9%
mini-stroke 41
 
< 0.1%
or 41
 
< 0.1%
tia 41
 
< 0.1%
stroke 25
 
< 0.1%
unknown 14
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 317413
49.9%
o 158786
25.0%
e 158731
25.0%
i 123
 
< 0.1%
t 107
 
< 0.1%
r 107
 
< 0.1%
82
 
< 0.1%
k 80
 
< 0.1%
s 66
 
< 0.1%
m 41
 
< 0.1%
Other values (4) 110
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 635523
> 99.9%
Space Separator 82
 
< 0.1%
Dash Punctuation 41
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 317413
49.9%
o 158786
25.0%
e 158731
25.0%
i 123
 
< 0.1%
t 107
 
< 0.1%
r 107
 
< 0.1%
k 80
 
< 0.1%
s 66
 
< 0.1%
m 41
 
< 0.1%
a 41
 
< 0.1%
Other values (2) 28
 
< 0.1%
Space Separator
ValueCountFrequency (%)
82
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41
100.0%
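
The category totals above can be reproduced from the value counts with Python's standard `unicodedata` module. This is a minimal sketch rather than the report generator's implementation; `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(value_counts):
    """Tally Unicode general categories over a {value: count} table."""
    totals = Counter()
    for value, count in value_counts.items():
        for ch in value:
            # Each character contributes once per row holding this value.
            totals[unicodedata.category(ch)] += count
    return totals


# Value counts copied from the stroke_diagnosis table.
stroke_diagnosis = {
    "none": 158665,
    "mini-stroke or tia": 41,
    "stroke": 25,
    "unknown": 14,
}
totals = category_counts(stroke_diagnosis)
# Ll = Lowercase Letter, Zs = Space Separator, Pd = Dash Punctuation
```

The two-letter codes are the standard abbreviations for the long category names shown in the tables.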

Most occurring scripts

ValueCountFrequency (%)
Latin 635523
> 99.9%
Common 123
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 317413
49.9%
o 158786
25.0%
e 158731
25.0%
i 123
 
< 0.1%
t 107
 
< 0.1%
r 107
 
< 0.1%
k 80
 
< 0.1%
s 66
 
< 0.1%
m 41
 
< 0.1%
a 41
 
< 0.1%
Other values (2) 28
 
< 0.1%
Common
ValueCountFrequency (%)
82
66.7%
- 41
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 635646
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 317413
49.9%
o 158786
25.0%
e 158731
25.0%
i 123
 
< 0.1%
t 107
 
< 0.1%
r 107
 
< 0.1%
82
 
< 0.1%
k 80
 
< 0.1%
s 66
 
< 0.1%
m 41
 
< 0.1%
Other values (4) 110
 
< 0.1%
Distinct9
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
158127 
stroke
 
277
"mini-stroke or tia"
 
272
unknown
 
41
stroke, "mini-stroke or tia"
 
16
Other values (4)
 
12

Length

Max length29
Median length4
Mean length4.0356169
Min length4

Characters and Unicode

Total characters640634
Distinct characters16
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

Value                          Count    Frequency (%)
none                           158127   99.6%
stroke                         277      0.2%
"mini-stroke or tia"           272      0.2%
unknown                        41       < 0.1%
stroke, "mini-stroke or tia"   16       < 0.1%
mini-stroke or tia             5        < 0.1%
"mini-stroke or tia", stroke   4        < 0.1%
"mini-stroke or tia", unknown  2        < 0.1%
unknown, "mini-stroke or tia"  1        < 0.1%
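
Several of the values above differ only in quoting or in the order of the selected labels. The sketch below shows one way to normalize such cells before counting, using the standard `csv` module. It assumes the field is a comma-separated multi-select with optional double quotes; `split_labels` is a hypothetical helper, not part of the profiled pipeline.

```python
import csv


def split_labels(raw):
    """Split one multi-select cell into a sorted tuple of clean labels."""
    # skipinitialspace lets the parser recognise a quote that follows ", ".
    row = next(csv.reader([raw], skipinitialspace=True))
    return tuple(sorted(label.strip() for label in row if label.strip()))


# The nine distinct raw values listed in the table above.
raw_values = [
    'none',
    'stroke',
    '"mini-stroke or tia"',
    'unknown',
    'stroke, "mini-stroke or tia"',
    'mini-stroke or tia',
    '"mini-stroke or tia", stroke',
    '"mini-stroke or tia", unknown',
    'unknown, "mini-stroke or tia"',
]
normalized = {split_labels(value) for value in raw_values}
```

Under these assumptions the nine raw values collapse to six label combinations.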

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 158127
99.2%
mini-stroke 300
 
0.2%
or 300
 
0.2%
tia 300
 
0.2%
stroke 297
 
0.2%
unknown 44
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 316686
49.4%
o 159068
24.8%
e 158724
24.8%
i 900
 
0.1%
t 897
 
0.1%
r 897
 
0.1%
k 641
 
0.1%
623
 
0.1%
s 597
 
0.1%
" 590
 
0.1%
Other values (6) 1011
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 639098
99.8%
Space Separator 623
 
0.1%
Other Punctuation 613
 
0.1%
Dash Punctuation 300
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 316686
49.6%
o 159068
24.9%
e 158724
24.8%
i 900
 
0.1%
t 897
 
0.1%
r 897
 
0.1%
k 641
 
0.1%
s 597
 
0.1%
m 300
 
< 0.1%
a 300
 
< 0.1%
Other values (2) 88
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
" 590
96.2%
, 23
 
3.8%
Space Separator
ValueCountFrequency (%)
623
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 300
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 639098
99.8%
Common 1536
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 316686
49.6%
o 159068
24.9%
e 158724
24.8%
i 900
 
0.1%
t 897
 
0.1%
r 897
 
0.1%
k 641
 
0.1%
s 597
 
0.1%
m 300
 
< 0.1%
a 300
 
< 0.1%
Other values (2) 88
 
< 0.1%
Common
ValueCountFrequency (%)
623
40.6%
" 590
38.4%
- 300
19.5%
, 23
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 640634
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 316686
49.4%
o 159068
24.8%
e 158724
24.8%
i 900
 
0.1%
t 897
 
0.1%
r 897
 
0.1%
k 641
 
0.1%
623
 
0.1%
s 597
 
0.1%
" 590
 
0.1%
Other values (6) 1011
 
0.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
157216 
True
 
1529
ValueCountFrequency (%)
False 157216
99.0%
True 1529
 
1.0%

test_proc_type
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 226
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value               Count
none                157302
other               565
blood_work          118
colonoscopy         110
bone_joint_imaging  55
Other values (221)  595

Length

Max length150
Median length4
Mean length4.1097861
Min length3

Characters and Unicode

Total characters652408
Distinct characters25
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique159 ?
Unique (%)0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rowmammogram, ekg

Common Values

ValueCountFrequency (%)
none 157302
99.1%
other 565
 
0.4%
blood_work 118
 
0.1%
colonoscopy 110
 
0.1%
bone_joint_imaging 55
 
< 0.1%
blood_work, other 51
 
< 0.1%
mammogram 40
 
< 0.1%
colonoscopy, other 24
 
< 0.1%
blood_work, urine_testing 23
 
< 0.1%
ekg 17
 
< 0.1%
Other values (216) 440
 
0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 157302
98.5%
other 776
 
0.5%
blood_work 470
 
0.3%
colonoscopy 235
 
0.1%
mammogram 155
 
0.1%
bone_joint_imaging 152
 
0.1%
urine_testing 130
 
0.1%
pap_smear 120
 
0.1%
vision_hearing 110
 
0.1%
ekg 103
 
0.1%
Other values (3) 119
 
0.1%

Most occurring characters

ValueCountFrequency (%)
n 315969
48.4%
o 161029
24.7%
e 159018
24.4%
r 1848
 
0.3%
t 1308
 
0.2%
_ 1210
 
0.2%
i 1122
 
0.2%
g 933
 
0.1%
927
 
0.1%
, 927
 
0.1%
Other values (15) 8117
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 649344
99.5%
Connector Punctuation 1210
 
0.2%
Space Separator 927
 
0.1%
Other Punctuation 927
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 315969
48.7%
o 161029
24.8%
e 159018
24.5%
r 1848
 
0.3%
t 1308
 
0.2%
i 1122
 
0.2%
g 933
 
0.1%
a 899
 
0.1%
m 892
 
0.1%
h 886
 
0.1%
Other values (12) 5440
 
0.8%
Connector Punctuation
ValueCountFrequency (%)
_ 1210
100.0%
Space Separator
ValueCountFrequency (%)
927
100.0%
Other Punctuation
ValueCountFrequency (%)
, 927
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 649344
99.5%
Common 3064
 
0.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 315969
48.7%
o 161029
24.8%
e 159018
24.5%
r 1848
 
0.3%
t 1308
 
0.2%
i 1122
 
0.2%
g 933
 
0.1%
a 899
 
0.1%
m 892
 
0.1%
h 886
 
0.1%
Other values (12) 5440
 
0.8%
Common
ValueCountFrequency (%)
_ 1210
39.5%
927
30.3%
, 927
30.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 652408
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 315969
48.4%
o 161029
24.7%
e 159018
24.4%
r 1848
 
0.3%
t 1308
 
0.2%
_ 1210
 
0.2%
i 1122
 
0.2%
g 933
 
0.1%
927
 
0.1%
, 927
 
0.1%
Other values (15) 8117
 
1.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
True
158264 
False
 
481
ValueCountFrequency (%)
True 158264
99.7%
False 481
 
0.3%

criminal_history
Categorical

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value                Count
none                 142080
misdemeanor          5660
felony               5244
dui                  3195
misdemeanor, felony  923
Other values (11)    1643

Length

Max length: 24
Median length: 4
Mean length: 4.5233803
Min length: 3

Characters and Unicode

Total characters: 718064
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: felony
2nd row: misdemeanor, felony
3rd row: none
4th row: none
5th row: none

Common Values

Value                     Count    Frequency (%)
none                      142080   89.5%
misdemeanor               5660     3.6%
felony                    5244     3.3%
dui                       3195     2.0%
misdemeanor, felony       923      0.6%
felony, misdemeanor       504      0.3%
misdemeanor, dui          444      0.3%
felony, dui               228      0.1%
misdemeanor, felony, dui  184      0.1%
dui, misdemeanor          106      0.1%
Other values (6)          177      0.1%
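
The table counts "misdemeanor, felony" and "felony, misdemeanor" as separate categories even though they carry the same labels. One way to remove that order dependence is to expand the field into one boolean indicator per label. The sketch below assumes a comma-separated multi-select whose only labels are misdemeanor, felony and dui; `to_indicators` and `LABELS` are hypothetical names.

```python
# Labels assumed from the values visible in the report.
LABELS = ("misdemeanor", "felony", "dui")


def to_indicators(raw, labels=LABELS):
    """Map one raw cell to {label: bool}, ignoring the order of the labels."""
    present = {part.strip() for part in raw.split(",")}
    # "none" matches no label, so every indicator is False for it.
    return {label: label in present for label in labels}


example = to_indicators("felony, misdemeanor")
```

Indicator columns also replace a 16-level categorical with three binary features, which is usually easier to model.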

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 142080
87.9%
misdemeanor 7948
 
4.9%
felony 7260
 
4.5%
dui 4334
 
2.7%

Most occurring characters

ValueCountFrequency (%)
n 299368
41.7%
e 165236
23.0%
o 157288
21.9%
m 15896
 
2.2%
i 12282
 
1.7%
d 12282
 
1.7%
s 7948
 
1.1%
a 7948
 
1.1%
r 7948
 
1.1%
f 7260
 
1.0%
Other values (5) 24608
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 712310
99.2%
Other Punctuation 2877
 
0.4%
Space Separator 2877
 
0.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 299368
42.0%
e 165236
23.2%
o 157288
22.1%
m 15896
 
2.2%
i 12282
 
1.7%
d 12282
 
1.7%
s 7948
 
1.1%
a 7948
 
1.1%
r 7948
 
1.1%
f 7260
 
1.0%
Other values (3) 18854
 
2.6%
Other Punctuation
ValueCountFrequency (%)
, 2877
100.0%
Space Separator
ValueCountFrequency (%)
2877
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 712310
99.2%
Common 5754
 
0.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 299368
42.0%
e 165236
23.2%
o 157288
22.1%
m 15896
 
2.2%
i 12282
 
1.7%
d 12282
 
1.7%
s 7948
 
1.1%
a 7948
 
1.1%
r 7948
 
1.1%
f 7260
 
1.0%
Other values (3) 18854
 
2.6%
Common
ValueCountFrequency (%)
, 2877
50.0%
2877
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 718064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 299368
41.7%
e 165236
23.0%
o 157288
21.9%
m 15896
 
2.2%
i 12282
 
1.7%
d 12282
 
1.7%
s 7948
 
1.1%
a 7948
 
1.1%
r 7948
 
1.1%
f 7260
 
1.0%
Other values (5) 24608
 
3.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158551 
True
 
194
ValueCountFrequency (%)
False 158551
99.9%
True 194
 
0.1%

disability_pmts_reason
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 88
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value              Count
none               143645
ssdi               7988
vadi               2239
long_term          1140
short_term         947
Other values (83)  2786

Length

Max length58
Median length4
Mean length4.201052
Min length4

Characters and Unicode

Total characters666896
Distinct characters22
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique34 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rowssdi
3rd rowssdi
4th rownone
5th rowssdi

Common Values

ValueCountFrequency (%)
none 143645
90.5%
ssdi 7988
 
5.0%
vadi 2239
 
1.4%
long_term 1140
 
0.7%
short_term 947
 
0.6%
other 734
 
0.5%
long_term, ssdi 417
 
0.3%
ssdi, vadi 366
 
0.2%
maternity 289
 
0.2%
workers_comp 203
 
0.1%
Other values (78) 777
 
0.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 143645
89.5%
ssdi 9271
 
5.8%
vadi 2844
 
1.8%
long_term 1904
 
1.2%
short_term 1173
 
0.7%
other 916
 
0.6%
maternity 352
 
0.2%
workers_comp 323
 
0.2%

Most occurring characters

ValueCountFrequency (%)
n 289546
43.4%
e 148313
22.2%
o 148284
22.2%
s 20038
 
3.0%
i 12467
 
1.9%
d 12115
 
1.8%
r 6164
 
0.9%
t 5870
 
0.9%
m 3752
 
0.6%
_ 3400
 
0.5%
Other values (12) 16947
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 660130
99.0%
Connector Punctuation 3400
 
0.5%
Other Punctuation 1683
 
0.3%
Space Separator 1683
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 289546
43.9%
e 148313
22.5%
o 148284
22.5%
s 20038
 
3.0%
i 12467
 
1.9%
d 12115
 
1.8%
r 6164
 
0.9%
t 5870
 
0.9%
m 3752
 
0.6%
a 3196
 
0.5%
Other values (9) 10385
 
1.6%
Connector Punctuation
ValueCountFrequency (%)
_ 3400
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1683
100.0%
Space Separator
ValueCountFrequency (%)
1683
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 660130
99.0%
Common 6766
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 289546
43.9%
e 148313
22.5%
o 148284
22.5%
s 20038
 
3.0%
i 12467
 
1.9%
d 12115
 
1.8%
r 6164
 
0.9%
t 5870
 
0.9%
m 3752
 
0.6%
a 3196
 
0.5%
Other values (9) 10385
 
1.6%
Common
ValueCountFrequency (%)
_ 3400
50.3%
, 1683
24.9%
1683
24.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 666896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 289546
43.4%
e 148313
22.2%
o 148284
22.2%
s 20038
 
3.0%
i 12467
 
1.9%
d 12115
 
1.8%
r 6164
 
0.9%
t 5870
 
0.9%
m 3752
 
0.6%
_ 3400
 
0.5%
Other values (12) 16947
 
2.5%

dui_count
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.034325491
Minimum: 0
Maximum: 25
Zeros: 154764
Zeros (%): 97.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 25
Range: 25
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.28000485
Coefficient of variation (CV): 8.1573443
Kurtosis: 1485.6139
Mean: 0.034325491
Median Absolute Deviation (MAD): 0
Skewness: 25.836988
Sum: 5449
Variance: 0.078402714
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Value             Count    Frequency (%)
0                 154764   97.5%
1                 3056     1.9%
2                 684      0.4%
3                 161      0.1%
4                 35       < 0.1%
6                 12       < 0.1%
5                 11       < 0.1%
7                 4        < 0.1%
20                4        < 0.1%
10                4        < 0.1%
Other values (6)  10       < 0.1%

Value             Count    Frequency (%)
0                 154764   97.5%
1                 3056     1.9%
2                 684      0.4%
3                 161      0.1%
4                 35       < 0.1%
5                 11       < 0.1%
6                 12       < 0.1%
7                 4        < 0.1%
8                 2        < 0.1%
9                 2        < 0.1%

Value             Count    Frequency (%)
25                1        < 0.1%
22                1        < 0.1%
20                4        < 0.1%
12                2        < 0.1%
11                2        < 0.1%
10                4        < 0.1%
9                 2        < 0.1%
8                 2        < 0.1%
7                 4        < 0.1%
6                 12       < 0.1%
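
The reported skewness for dui_count can be approximated directly from the frequency tables above. The sketch below assumes the report uses the bias-adjusted Fisher-Pearson estimator (the convention used by pandas' `Series.skew`); that choice and the helper name `sample_skewness` are assumptions, and the full distribution was assembled by combining the lowest and highest value tables.

```python
import math

# Full dui_count distribution assembled from the tables: value -> count.
counts = {0: 154764, 1: 3056, 2: 684, 3: 161, 4: 35, 5: 11, 6: 12, 7: 4,
          8: 2, 9: 2, 10: 4, 11: 2, 12: 2, 20: 4, 22: 1, 25: 1}


def sample_skewness(counts):
    """Bias-adjusted Fisher-Pearson skewness from a {value: count} table."""
    n = sum(counts.values())
    mean = sum(v * c for v, c in counts.items()) / n
    # Second and third central moments (population form).
    m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
    m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment; negligible at this sample size.
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)


n_rows = sum(counts.values())
total = sum(v * c for v, c in counts.items())
skew = sample_skewness(counts)
```

The row count and sum recovered this way match the reported 158745 observations and Sum of 5449, which suggests the assembled table is complete.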
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
158470 
no
 
223
yes
 
36
unknown
 
16

Length

Max length7
Median length4
Mean length3.9972661
Min length2

Characters and Unicode

Total characters634546
Distinct characters8
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

ValueCountFrequency (%)
none 158470
99.8%
no 223
 
0.1%
yes 36
 
< 0.1%
unknown 16
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 158470
99.8%
no 223
 
0.1%
yes 36
 
< 0.1%
unknown 16
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 317211
50.0%
o 158709
25.0%
e 158506
25.0%
y 36
 
< 0.1%
s 36
 
< 0.1%
u 16
 
< 0.1%
k 16
 
< 0.1%
w 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 634546
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 317211
50.0%
o 158709
25.0%
e 158506
25.0%
y 36
 
< 0.1%
s 36
 
< 0.1%
u 16
 
< 0.1%
k 16
 
< 0.1%
w 16
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 634546
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 317211
50.0%
o 158709
25.0%
e 158506
25.0%
y 36
 
< 0.1%
s 36
 
< 0.1%
u 16
 
< 0.1%
k 16
 
< 0.1%
w 16
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 634546
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 317211
50.0%
o 158709
25.0%
e 158506
25.0%
y 36
 
< 0.1%
s 36
 
< 0.1%
u 16
 
< 0.1%
k 16
 
< 0.1%
w 16
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
151215 
True
 
7530
ValueCountFrequency (%)
False 151215
95.3%
True 7530
 
4.7%
Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.4 MiB
none
158682 
extended_travel
 
57
AF
 
2
IQ
 
2
AF, IQ
 
1

Length

Max length23
Median length4
Mean length4.0040316
Min length2

Characters and Unicode

Total characters635620
Distinct characters17
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

ValueCountFrequency (%)
none 158682
> 99.9%
extended_travel 57
 
< 0.1%
AF 2
 
< 0.1%
IQ 2
 
< 0.1%
AF, IQ 1
 
< 0.1%
AF, IQ, extended_travel 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
none 158682
> 99.9%
extended_travel 58
 
< 0.1%
af 4
 
< 0.1%
iq 4
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
n 317422
49.9%
e 158914
25.0%
o 158682
25.0%
t 116
 
< 0.1%
d 116
 
< 0.1%
l 58
 
< 0.1%
v 58
 
< 0.1%
a 58
 
< 0.1%
r 58
 
< 0.1%
_ 58
 
< 0.1%
Other values (7) 80
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 635540
> 99.9%
Connector Punctuation 58
 
< 0.1%
Uppercase Letter 16
 
< 0.1%
Other Punctuation 3
 
< 0.1%
Space Separator 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 317422
49.9%
e 158914
25.0%
o 158682
25.0%
t 116
 
< 0.1%
d 116
 
< 0.1%
l 58
 
< 0.1%
v 58
 
< 0.1%
a 58
 
< 0.1%
r 58
 
< 0.1%
x 58
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 4
25.0%
F 4
25.0%
I 4
25.0%
Q 4
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 58
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 635556
> 99.9%
Common 64
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 317422
49.9%
e 158914
25.0%
o 158682
25.0%
t 116
 
< 0.1%
d 116
 
< 0.1%
l 58
 
< 0.1%
v 58
 
< 0.1%
a 58
 
< 0.1%
r 58
 
< 0.1%
x 58
 
< 0.1%
Other values (4) 16
 
< 0.1%
Common
ValueCountFrequency (%)
_ 58
90.6%
, 3
 
4.7%
3
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 635620
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 317422
49.9%
e 158914
25.0%
o 158682
25.0%
t 116
 
< 0.1%
d 116
 
< 0.1%
l 58
 
< 0.1%
v 58
 
< 0.1%
a 58
 
< 0.1%
r 58
 
< 0.1%
_ 58
 
< 0.1%
Other values (7) 80
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
152813 
True
 
5932
ValueCountFrequency (%)
False 152813
96.3%
True 5932
 
3.7%

mental_health_diagnosis
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 793
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value                      Count
none                       139887
depression, anxiety        2919
anxiety                    2907
depression                 2804
depression, anxiety, ptsd  977
Other values (788)         9251

Length

Max length94
Median length4
Mean length5.6466849
Min length3

Characters and Unicode

Total characters896383
Distinct characters21
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique449 ?
Unique (%)0.3%

Sample

1st rownone
2nd rownone
3rd rownone
4th rowdepression, anxiety, add
5th rowbipolar

Common Values

ValueCountFrequency (%)
none 139887
88.1%
depression, anxiety 2919
 
1.8%
anxiety 2907
 
1.8%
depression 2804
 
1.8%
depression, anxiety, ptsd 977
 
0.6%
bipolar 789
 
0.5%
ptsd 588
 
0.4%
depression, anxiety, bipolar 520
 
0.3%
anxiety, depression 460
 
0.3%
depression, anxiety, ptsd, bipolar 372
 
0.2%
Other values (783) 6522
 
4.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 139887
77.7%
depression 12432
 
6.9%
anxiety 12426
 
6.9%
ptsd 4769
 
2.6%
bipolar 4240
 
2.4%
add 2786
 
1.5%
ocd 1120
 
0.6%
other 917
 
0.5%
schizophrenia 717
 
0.4%
psychosis 401
 
0.2%

Most occurring characters

ValueCountFrequency (%)
n 305681
34.1%
e 179475
20.0%
o 160046
17.9%
s 31885
 
3.6%
i 31597
 
3.5%
d 24557
 
2.7%
p 22559
 
2.5%
21282
 
2.4%
, 21282
 
2.4%
a 20501
 
2.3%
Other values (11) 77518
 
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 853487
95.2%
Space Separator 21282
 
2.4%
Other Punctuation 21282
 
2.4%
Connector Punctuation 332
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 305681
35.8%
e 179475
21.0%
o 160046
18.8%
s 31885
 
3.7%
i 31597
 
3.7%
d 24557
 
2.9%
p 22559
 
2.6%
a 20501
 
2.4%
r 18970
 
2.2%
t 18444
 
2.2%
Other values (8) 39772
 
4.7%
Space Separator
ValueCountFrequency (%)
21282
100.0%
Other Punctuation
ValueCountFrequency (%)
, 21282
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 332
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 853487
95.2%
Common 42896
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 305681
35.8%
e 179475
21.0%
o 160046
18.8%
s 31885
 
3.7%
i 31597
 
3.7%
d 24557
 
2.9%
p 22559
 
2.6%
a 20501
 
2.4%
r 18970
 
2.2%
t 18444
 
2.2%
Other values (8) 39772
 
4.7%
Common
ValueCountFrequency (%)
21282
49.6%
, 21282
49.6%
_ 332
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 896383
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 305681
34.1%
e 179475
20.0%
o 160046
17.9%
s 31885
 
3.6%
i 31597
 
3.5%
d 24557
 
2.7%
p 22559
 
2.5%
21282
 
2.4%
, 21282
 
2.4%
a 20501
 
2.3%
Other values (11) 77518
 
8.6%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158172 
True
 
573
ValueCountFrequency (%)
False 158172
99.6%
True 573
 
0.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158661 
True
 
84
ValueCountFrequency (%)
False 158661
99.9%
True 84
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158494 
True
 
251
ValueCountFrequency (%)
False 158494
99.8%
True 251
 
0.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
True
158448 
False
 
297
ValueCountFrequency (%)
True 158448
99.8%
False 297
 
0.2%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158641 
True
 
104
ValueCountFrequency (%)
False 158641
99.9%
True 104
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158596 
True
 
149
ValueCountFrequency (%)
False 158596
99.9%
True 149
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
158740 
True
 
5
ValueCountFrequency (%)
False 158740
> 99.9%
True 5
 
< 0.1%

tb
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count    Frequency (%)
False  138332   87.1%
True   20413    12.9%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.4 MiB
False
139588 
True
19157 
ValueCountFrequency (%)
False 139588
87.9%
True 19157
 
12.1%

weight_loss_amount
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 241
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1881067
Minimum: 0
Maximum: 7403
Zeros: 139638
Zeros (%): 88.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 20
Maximum: 7403
Range: 7403
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 27.034006
Coefficient of variation (CV): 8.4796426
Kurtosis: 39180.065
Mean: 3.1881067
Median Absolute Deviation (MAD): 0
Skewness: 160.88351
Sum: 506096
Variance: 730.83746
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
0                   139638   88.0%
15                  3539     2.2%
20                  2986     1.9%
10                  2412     1.5%
12                  1508     0.9%
30                  1387     0.9%
25                  1209     0.8%
40                  648      0.4%
11                  623      0.4%
50                  481      0.3%
Other values (231)  4314     2.7%

Value               Count    Frequency (%)
0                   139638   88.0%
1                   6        < 0.1%
2                   24       < 0.1%
3                   23       < 0.1%
4                   21       < 0.1%
5                   238      0.1%
6                   66       < 0.1%
7                   59       < 0.1%
8                   154      0.1%
9                   53       < 0.1%

Value               Count    Frequency (%)
7403                1        < 0.1%
4047                1        < 0.1%
2248                1        < 0.1%
1900                2        < 0.1%
1192                1        < 0.1%
1000                2        < 0.1%
511                 1        < 0.1%
500                 3        < 0.1%
430                 1        < 0.1%
423                 1        < 0.1%
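
Because 88.0% of weight_loss_amount is zero, the interquartile range is 0 and quartile-based outlier fences are not informative here, while a handful of extreme entries (7403, 4047, 2248) dominate the mean and variance. The sketch below shows one possible treatment: split the field into a presence flag and a capped amount. The cap of 300 is purely illustrative, since the report does not state the unit or a plausible upper bound, and `split_zero_inflated` is a hypothetical helper.

```python
def split_zero_inflated(values, cap):
    """Return (presence flags, capped amounts, indices of values above cap)."""
    has_value = [v > 0 for v in values]
    # Clip extreme entries to the cap instead of dropping the rows.
    capped = [min(v, cap) for v in values]
    flagged = [i for i, v in enumerate(values) if v > cap]
    return has_value, capped, flagged


# Small illustrative sample mixing typical and extreme values from the report.
sample = [0, 0, 15, 20, 0, 500, 7403]
has_value, capped, flagged = split_zero_inflated(sample, cap=300)
```

Keeping the flagged indices makes it possible to review the extreme rows manually rather than silently altering them.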

weight_loss_reason
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 166
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value               Count
none                151735
diet, exercise      2548
diet                782
other               673
exercise            469
Other values (161)  2538

Length

Max length73
Median length4
Mean length4.3863933
Min length4

Characters and Unicode

Total characters696318
Distinct characters21
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique78 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rowmedication
5th rownone

Common Values

ValueCountFrequency (%)
none 151735
95.6%
diet, exercise 2548
 
1.6%
diet 782
 
0.5%
other 673
 
0.4%
exercise 469
 
0.3%
childbirth 435
 
0.3%
exercise, diet 406
 
0.3%
medical_condition 272
 
0.2%
surgery 254
 
0.2%
diet, exercise, medication 116
 
0.1%
Other values (156) 1055
 
0.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 151735
92.8%
diet 4580
 
2.8%
exercise 4160
 
2.5%
other 1022
 
0.6%
childbirth 659
 
0.4%
medical_condition 509
 
0.3%
surgery 509
 
0.3%
medication 417
 
0.3%

Most occurring characters

ValueCountFrequency (%)
n 304905
43.8%
e 171252
24.6%
o 154192
22.1%
i 12419
 
1.8%
t 7187
 
1.0%
r 6859
 
1.0%
d 6674
 
1.0%
c 6254
 
0.9%
4846
 
0.7%
, 4846
 
0.7%
Other values (11) 16884
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 686117
98.5%
Space Separator 4846
 
0.7%
Other Punctuation 4846
 
0.7%
Connector Punctuation 509
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 304905
44.4%
e 171252
25.0%
o 154192
22.5%
i 12419
 
1.8%
t 7187
 
1.0%
r 6859
 
1.0%
d 6674
 
1.0%
c 6254
 
0.9%
s 4669
 
0.7%
x 4160
 
0.6%
Other values (8) 7546
 
1.1%
Space Separator
ValueCountFrequency (%)
4846
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4846
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 509
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 686117
98.5%
Common 10201
 
1.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 304905
44.4%
e 171252
25.0%
o 154192
22.5%
i 12419
 
1.8%
t 7187
 
1.0%
r 6859
 
1.0%
d 6674
 
1.0%
c 6254
 
0.9%
s 4669
 
0.7%
x 4160
 
0.6%
Other values (8) 7546
 
1.1%
Common
ValueCountFrequency (%)
4846
47.5%
, 4846
47.5%
_ 509
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 696318
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 304905
43.8%
e 171252
24.6%
o 154192
22.1%
i 12419
 
1.8%
t 7187
 
1.0%
r 6859
 
1.0%
d 6674
 
1.0%
c 6254
 
0.9%
4846
 
0.7%
, 4846
 
0.7%
Other values (11) 16884
 
2.4%

cancer_type
Categorical

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB

Value             Count
none              155691
other             2722
basal             156
squamous          122
basal, other      19
Other values (6)  35

Length

Max length22
Median length4
Mean length4.0246937
Min length4

Characters and Unicode

Total characters638900
Distinct characters15
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st rownone
2nd rownone
3rd rownone
4th rownone
5th rownone

Common Values

ValueCountFrequency (%)
none 155691
98.1%
other 2722
 
1.7%
basal 156
 
0.1%
squamous 122
 
0.1%
basal, other 19
 
< 0.1%
basal, squamous 12
 
< 0.1%
squamous, other 10
 
< 0.1%
other, basal 6
 
< 0.1%
basal, squamous, other 5
 
< 0.1%
squamous, basal 1
 
< 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none 155691
98.0%
other 2763
 
1.7%
basal 199
 
0.1%
squamous 151
 
0.1%

Most occurring characters

Value | Count | Frequency (%)
n | 311382 | 48.7%
o | 158605 | 24.8%
e | 158454 | 24.8%
t | 2763 | 0.4%
h | 2763 | 0.4%
r | 2763 | 0.4%
a | 549 | 0.1%
s | 501 | 0.1%
u | 302 | < 0.1%
b | 199 | < 0.1%
Other values (5) | 619 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 638782 | > 99.9%
Other Punctuation | 59 | < 0.1%
Space Separator | 59 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 311382 | 48.7%
o | 158605 | 24.8%
e | 158454 | 24.8%
t | 2763 | 0.4%
h | 2763 | 0.4%
r | 2763 | 0.4%
a | 549 | 0.1%
s | 501 | 0.1%
u | 302 | < 0.1%
b | 199 | < 0.1%
Other values (3) | 501 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
, | 59 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 59 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 638782 | > 99.9%
Common | 118 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 311382 | 48.7%
o | 158605 | 24.8%
e | 158454 | 24.8%
t | 2763 | 0.4%
h | 2763 | 0.4%
r | 2763 | 0.4%
a | 549 | 0.1%
s | 501 | 0.1%
u | 302 | < 0.1%
b | 199 | < 0.1%
Other values (3) | 501 | 0.1%
Common
Value | Count | Frequency (%)
, | 59 | 50.0%
(space) | 59 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 638900 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 311382 | 48.7%
o | 158605 | 24.8%
e | 158454 | 24.8%
t | 2763 | 0.4%
h | 2763 | 0.4%
r | 2763 | 0.4%
a | 549 | 0.1%
s | 501 | 0.1%
u | 302 | < 0.1%
b | 199 | < 0.1%
Other values (5) | 619 | 0.1%

citizen
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB
True: 151693
False: 7052
Value | Count | Frequency (%)
True | 151693 | 95.6%
False | 7052 | 4.4%

legal_resident
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
resident: 156788
visa: 998
none: 959

Length

Max length: 8
Median length: 8
Mean length: 7.9506882
Min length: 4

Characters and Unicode

Total characters: 1262132
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: resident
2nd row: resident
3rd row: resident
4th row: resident
5th row: resident

Common Values

Value | Count | Frequency (%)
resident | 156788 | 98.8%
visa | 998 | 0.6%
none | 959 | 0.6%
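
The length and character statistics reported for legal_resident follow directly from these three value counts. The short sketch below recomputes the total character count, the mean length, and the per-character tallies; the variable names are illustrative.

```python
from collections import Counter

# Value counts as listed under "Common Values" for legal_resident.
value_counts = {"resident": 156788, "visa": 998, "none": 959}

n = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n

# Weight each character of each value by how often the value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

print("observations:", n)
print("total characters:", total_chars)
print("mean length:", round(mean_length, 7))
print("most common characters:", char_counts.most_common(3))
```

Because "resident" accounts for almost every row, the mean length sits just below 8 and the letter "e", which occurs twice in "resident", is roughly twice as frequent as any other character.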

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
resident | 156788 | 98.8%
visa | 998 | 0.6%
none | 959 | 0.6%

Most occurring characters

Value | Count | Frequency (%)
e | 314535 | 24.9%
n | 158706 | 12.6%
s | 157786 | 12.5%
i | 157786 | 12.5%
r | 156788 | 12.4%
d | 156788 | 12.4%
t | 156788 | 12.4%
v | 998 | 0.1%
a | 998 | 0.1%
o | 959 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1262132 | 100.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 314535 | 24.9%
n | 158706 | 12.6%
s | 157786 | 12.5%
i | 157786 | 12.5%
r | 156788 | 12.4%
d | 156788 | 12.4%
t | 156788 | 12.4%
v | 998 | 0.1%
a | 998 | 0.1%
o | 959 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1262132 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 314535 | 24.9%
n | 158706 | 12.6%
s | 157786 | 12.5%
i | 157786 | 12.5%
r | 156788 | 12.4%
d | 156788 | 12.4%
t | 156788 | 12.4%
v | 998 | 0.1%
a | 998 | 0.1%
o | 959 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1262132 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 314535 | 24.9%
n | 158706 | 12.6%
s | 157786 | 12.5%
i | 157786 | 12.5%
r | 156788 | 12.4%
d | 156788 | 12.4%
t | 156788 | 12.4%
v | 998 | 0.1%
a | 998 | 0.1%
o | 959 | 0.1%

skydive_count
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 44
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.15561435
Minimum: 0
Maximum: 7800
Zeros: 157778
Zeros (%): 99.4%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 7800
Range: 7800
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 23.420848
Coefficient of variation (CV): 150.50571
Kurtosis: 81310.571
Mean: 0.15561435
Median Absolute Deviation (MAD): 0
Skewness: 265.03564
Sum: 24703
Variance: 548.53611
Monotonicity: Not monotonic
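
Several of the skydive_count statistics can be cross-checked against one another with basic arithmetic. The sketch below recomputes the mean, coefficient of variation, variance, and share of zeros from the reported sum, standard deviation, and zero count, taking the 158745 observations from the dataset overview. It is a consistency check under those inputs, not the report's own computation.

```python
# Figures copied from the report.
n_observations = 158745   # "Number of observations" in the overview
total = 24703             # "Sum"
std_dev = 23.420848       # "Standard deviation"
zeros = 157778            # "Zeros"

mean = total / n_observations               # reported as 0.15561435
coeff_of_variation = std_dev / mean         # reported as 150.50571
variance = std_dev ** 2                     # reported as 548.53611
zeros_pct = 100 * zeros / n_observations    # reported as 99.4%

print(f"mean      = {mean:.8f}")
print(f"CV        = {coeff_of_variation:.5f}")
print(f"variance  = {variance:.5f}")
print(f"zeros (%) = {zeros_pct:.1f}")
```

A coefficient of variation above 150 reflects a tiny mean set against a handful of very large values (up to 7800), which is consistent with the SKEWED and ZEROS alerts on this variable.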
Histogram with fixed size bins (bins=44)
Value | Count | Frequency (%)
0 | 157778 | 99.4%
1 | 527 | 0.3%
2 | 188 | 0.1%
3 | 60 | < 0.1%
5 | 54 | < 0.1%
4 | 24 | < 0.1%
10 | 19 | < 0.1%
6 | 13 | < 0.1%
7 | 12 | < 0.1%
50 | 7 | < 0.1%
Other values (34) | 63 | < 0.1%

Value | Count | Frequency (%)
0 | 157778 | 99.4%
1 | 527 | 0.3%
2 | 188 | 0.1%
3 | 60 | < 0.1%
4 | 24 | < 0.1%
5 | 54 | < 0.1%
6 | 13 | < 0.1%
7 | 12 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
7800 | 1 | < 0.1%
3000 | 2 | < 0.1%
2040 | 1 | < 0.1%
999 | 3 | < 0.1%
700 | 1 | < 0.1%
444 | 1 | < 0.1%
300 | 2 | < 0.1%
250 | 1 | < 0.1%
200 | 1 | < 0.1%
165 | 1 | < 0.1%

surgery_type
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 137
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
none: 150433
other: 4053
bone_joint: 2337
csection: 440
gallbladder: 393
Other values (132): 1089

Length

Max length: 112
Median length: 4
Mean length: 4.2204731
Min length: 4

Characters and Unicode

Total characters: 669979
Distinct characters: 21
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): < 0.1%

Sample

1st row: none
2nd row: none
3rd row: none
4th row: none
5th row: none

Common Values

Value | Count | Frequency (%)
none | 150433 | 94.8%
other | 4053 | 2.6%
bone_joint | 2337 | 1.5%
csection | 440 | 0.3%
gallbladder | 393 | 0.2%
dental | 151 | 0.1%
cosmetic | 128 | 0.1%
vision_hearing | 118 | 0.1%
bone_joint, other | 93 | 0.1%
hemorrhoid | 58 | < 0.1%
Other values (127) | 541 | 0.3%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
none | 150433 | 94.4%
other | 4397 | 2.8%
bone_joint | 2608 | 1.6%
csection | 534 | 0.3%
gallbladder | 486 | 0.3%
dental | 337 | 0.2%
cosmetic | 194 | 0.1%
vision_hearing | 184 | 0.1%
hemorrhoid | 96 | 0.1%
tonsil_adenoid | 83 | 0.1%
Other values (2) | 78 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
n | 307531 | 45.9%
o | 161394 | 24.1%
e | 159474 | 23.8%
t | 8187 | 1.2%
r | 5303 | 0.8%
h | 4773 | 0.7%
i | 4238 | 0.6%
b | 3094 | 0.5%
_ | 2919 | 0.4%
j | 2608 | 0.4%
Other values (11) | 10458 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 665690 | 99.4%
Connector Punctuation | 2919 | 0.4%
Other Punctuation | 685 | 0.1%
Space Separator | 685 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
n | 307531 | 46.2%
o | 161394 | 24.2%
e | 159474 | 24.0%
t | 8187 | 1.2%
r | 5303 | 0.8%
h | 4773 | 0.7%
i | 4238 | 0.6%
b | 3094 | 0.5%
j | 2608 | 0.4%
l | 1878 | 0.3%
Other values (8) | 7210 | 1.1%
Connector Punctuation
Value | Count | Frequency (%)
_ | 2919 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 685 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 685 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 665690 | 99.4%
Common | 4289 | 0.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 307531 | 46.2%
o | 161394 | 24.2%
e | 159474 | 24.0%
t | 8187 | 1.2%
r | 5303 | 0.8%
h | 4773 | 0.7%
i | 4238 | 0.6%
b | 3094 | 0.5%
j | 2608 | 0.4%
l | 1878 | 0.3%
Other values (8) | 7210 | 1.1%
Common
Value | Count | Frequency (%)
_ | 2919 | 68.1%
, | 685 | 16.0%
(space) | 685 | 16.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 669979 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 307531 | 45.9%
o | 161394 | 24.1%
e | 159474 | 23.8%
t | 8187 | 1.2%
r | 5303 | 0.8%
h | 4773 | 0.7%
i | 4238 | 0.6%
b | 3094 | 0.5%
_ | 2919 | 0.4%
j | 2608 | 0.4%
Other values (11) | 10458 | 1.6%

travel_countries
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 1153
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 MiB
none: 151664
MX: 1464
JM: 317
BS: 283
CA: 256
Other values (1148): 4761

Length

Max length: 846
Median length: 4
Mean length: 3.9837349
Min length: 2

Characters and Unicode

Total characters: 632398
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 884
Unique (%): 0.6%

Sample

1st row: none
2nd row: none
3rd row: none
4th row: none
5th row: none

Common Values

Value | Count | Frequency (%)
none | 151664 | 95.5%
MX | 1464 | 0.9%
JM | 317 | 0.2%
BS | 283 | 0.2%
CA | 256 | 0.2%
DO | 247 | 0.2%
IN | 242 | 0.2%
GB | 221 | 0.1%
IT | 148 | 0.1%
CR | 146 | 0.1%
Other values (1143) | 3757 | 2.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
none | 151664 | 93.8%
mx | 1819 | 1.1%
bs | 448 | 0.3%
jm | 437 | 0.3%
gb | 417 | 0.3%
ca | 409 | 0.3%
it | 345 | 0.2%
do | 340 | 0.2%
fr | 324 | 0.2%
in | 287 | 0.2%
Other values (247) | 5150 | 3.2%

Most occurring characters

Value | Count | Frequency (%)
n | 303328 | 48.0%
o | 151664 | 24.0%
e | 151664 | 24.0%
(space) | 2895 | 0.5%
, | 2895 | 0.5%
M | 2574 | 0.4%
X | 1864 | 0.3%
B | 1356 | 0.2%
R | 1208 | 0.2%
A | 1173 | 0.2%
Other values (21) | 11777 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 606656 | 95.9%
Uppercase Letter | 19952 | 3.2%
Space Separator | 2895 | 0.5%
Other Punctuation | 2895 | 0.5%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
M | 2574 | 12.9%
X | 1864 | 9.3%
B | 1356 | 6.8%
R | 1208 | 6.1%
A | 1173 | 5.9%
C | 1146 | 5.7%
I | 1037 | 5.2%
G | 965 | 4.8%
S | 934 | 4.7%
E | 896 | 4.5%
Other values (16) | 6799 | 34.1%
Lowercase Letter
Value | Count | Frequency (%)
n | 303328 | 50.0%
o | 151664 | 25.0%
e | 151664 | 25.0%
Space Separator
Value | Count | Frequency (%)
(space) | 2895 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 2895 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 626608 | 99.1%
Common | 5790 | 0.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
n | 303328 | 48.4%
o | 151664 | 24.2%
e | 151664 | 24.2%
M | 2574 | 0.4%
X | 1864 | 0.3%
B | 1356 | 0.2%
R | 1208 | 0.2%
A | 1173 | 0.2%
C | 1146 | 0.2%
I | 1037 | 0.2%
Other values (19) | 9594 | 1.5%
Common
Value | Count | Frequency (%)
(space) | 2895 | 50.0%
, | 2895 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 632398 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
n | 303328 | 48.0%
o | 151664 | 24.0%
e | 151664 | 24.0%
(space) | 2895 | 0.5%
, | 2895 | 0.5%
M | 2574 | 0.4%
X | 1864 | 0.3%
B | 1356 | 0.2%
R | 1208 | 0.2%
A | 1173 | 0.2%
Other values (21) | 11777 | 1.9%

Interactions

[Figure: 14 × 14 grid of pairwise interaction plots for the 14 numeric variables (196 panels); images not preserved in this text export.]

Correlations
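
The HIGH CORRELATION alerts in this report can be related to the coefficients in this section's matrix by filtering variable pairs against a cut-off. The sketch below does this for a handful of coefficients copied from the matrix. The threshold of 0.5 and the helper `high_pairs` are assumptions for illustration; the report does not state the cut-off it uses.

```python
# A few coefficients copied from the correlation matrix in this section.
corr = {
    ("height", "weight"): 0.576,
    ("weight", "bmi_app_state"): 0.807,
    ("height", "gender"): 0.706,
    ("weight", "gender"): 0.413,
    ("diabetes_age", "diabetes_hemoglobin"): 0.685,
    ("marijuana_monthly_count", "marijuana"): 0.612,
    ("current_ins", "replacement_ins"): 0.318,
    ("previous_declined", "previous_decline_reason"): 0.977,
}

THRESHOLD = 0.5  # assumed cut-off, not taken from the report

def high_pairs(coefficients, threshold):
    """Return (a, b, r) for pairs with |r| >= threshold, strongest first."""
    selected = [(a, b, r) for (a, b), r in coefficients.items() if abs(r) >= threshold]
    return sorted(selected, key=lambda item: abs(item[2]), reverse=True)

flagged = high_pairs(corr, THRESHOLD)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:.3f}")
```

With this cut-off the six selected pairs match pairs named in the report's alerts, while weight and gender (0.413) and current_ins and replacement_ins (0.318) fall below it.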

variable age height weight bmi_app_state income_app_state current_ins_value diabetes_age alcohol_weekly marijuana_monthly_count diabetes_hemoglobin stroke_count dui_count weight_loss_amount skydive_count gender state approved_risk_class current_ins replacement_ins marijuana valid_drivers_license_app_state hiv_pos covid previous_declined previous_decline_reason chestpain_diagnosis diabetes_complications diabetes_gestational diabetes_hospitalization employment_status family_history final_expense inpatient med_advice seizure_diagnosis stroke_diagnosis stroke_diagnosis_multiselect test_proc_outstanding climbing_equipment criminal_history disability_audio_visual disability_pmts_reason expected_travel_90_days expected_travel expected_travel_multiselect illicit_drugs mental_health_hospitalized mental_health_missed_work pilot_student_private racing_100mph rx_increase scuba_130ft seizure_car_accident tb weight_loss cancer_type citizen legal_resident
age 1.000 0.059 0.125 0.113 0.167 0.182 0.146 0.032 -0.139 0.143 0.043 -0.043 -0.049 -0.079 0.058 0.018 0.064 0.182 0.088 0.142 0.087 0.013 0.017 0.015 0.011 0.012 0.019 0.025 0.025 0.098 0.010 0.012 0.011 0.029 0.005 0.004 0.024 0.009 0.047 0.049 0.004 0.056 0.010 0.032 0.000 0.086 0.025 0.007 0.023 0.043 0.014 0.043 0.007 0.097 0.052 0.024 0.043 0.041
height 0.059 1.000 0.576 0.016 0.231 0.061 0.045 0.113 0.027 0.038 -0.000 0.028 -0.012 0.001 0.706 0.018 0.027 0.054 0.027 0.024 0.014 0.025 0.012 0.015 0.000 0.014 0.006 0.057 0.009 0.064 0.000 0.015 0.006 0.013 0.000 0.000 0.009 0.009 0.000 0.017 0.000 0.027 0.000 0.018 0.000 0.004 0.015 0.005 0.017 0.000 0.000 0.003 0.000 0.000 0.019 0.004 0.039 0.017
weight 0.125 0.576 1.000 0.807 0.131 0.050 0.130 0.044 -0.034 0.079 0.005 -0.008 0.127 -0.030 0.413 0.023 0.069 0.048 0.012 0.035 0.019 0.017 0.000 0.039 0.013 0.008 0.008 0.013 0.010 0.055 0.000 0.004 0.006 0.012 0.000 0.004 0.003 0.011 0.018 0.010 0.000 0.020 0.005 0.012 0.002 0.033 0.012 0.004 0.005 0.007 0.005 0.009 0.000 0.029 0.134 0.002 0.061 0.022
bmi_app_state 0.113 0.016 0.807 1.000 -0.002 0.018 0.131 -0.025 -0.056 0.075 0.008 -0.028 0.167 -0.036 0.146 0.026 0.143 0.041 0.020 0.066 0.018 0.015 0.017 0.043 0.016 0.013 0.009 0.048 0.015 0.039 0.001 0.012 0.008 0.020 0.005 0.000 0.007 0.014 0.022 0.018 0.006 0.024 0.007 0.030 0.007 0.045 0.014 0.000 0.010 0.014 0.003 0.019 0.004 0.058 0.173 0.004 0.054 0.026
income_app_state 0.167 0.231 0.131 -0.002 1.000 0.223 -0.012 0.152 -0.138 0.036 -0.018 -0.031 -0.062 -0.038 0.071 0.014 0.027 0.094 0.026 0.029 0.011 0.000 0.007 0.000 0.000 0.000 0.000 0.010 0.000 0.031 0.000 0.025 0.007 0.005 0.017 0.000 0.003 0.000 0.000 0.006 0.000 0.000 0.010 0.046 0.000 0.020 0.005 0.000 0.018 0.004 0.000 0.007 0.000 0.034 0.019 0.008 0.019 0.015
current_ins_value 0.182 0.061 0.050 0.018 0.223 1.000 0.028 0.064 -0.067 0.042 0.005 -0.028 -0.009 -0.018 0.003 0.000 0.000 0.030 0.018 0.000 0.000 0.000 0.026 0.014 0.051 0.000 0.000 0.000 0.000 0.008 0.019 0.000 0.091 0.004 0.000 0.000 0.000 0.008 0.000 0.031 0.000 0.000 0.000 0.000 0.000 0.013 0.000 0.000 0.039 0.014 0.000 0.043 0.000 0.007 0.004 0.000 0.000 0.000
diabetes_age 0.146 0.045 0.130 0.131 -0.012 0.028 1.000 -0.048 -0.025 0.685 0.027 -0.020 0.061 -0.014 0.070 0.012 0.089 0.046 0.017 0.031 0.021 0.000 0.033 0.056 0.026 0.034 0.082 0.018 0.170 0.029 0.009 0.027 0.010 0.026 0.013 0.015 0.020 0.010 0.005 0.010 0.030 0.044 0.000 0.012 0.000 0.018 0.006 0.000 0.000 0.003 0.000 0.003 0.000 0.026 0.063 0.006 0.016 0.013
alcohol_weekly 0.032 0.113 0.044 -0.025 0.152 0.064 -0.048 1.000 0.085 -0.027 -0.003 0.042 -0.007 0.017 0.108 0.018 0.027 0.030 0.019 0.081 0.011 0.005 0.025 0.018 0.009 0.025 0.000 0.012 0.000 0.027 0.011 0.016 0.005 0.012 0.025 0.000 0.000 0.021 0.017 0.020 0.000 0.000 0.000 0.053 0.000 0.048 0.000 0.000 0.008 0.007 0.000 0.037 0.000 0.092 0.002 0.008 0.005 0.001
marijuana_monthly_count -0.139 0.027 -0.034 -0.056 -0.138 -0.067 -0.025 0.085 1.000 -0.037 -0.009 0.055 0.062 0.079 0.024 0.030 0.084 0.050 0.021 0.612 0.014 0.021 0.041 0.012 0.010 0.057 0.000 0.007 0.000 0.033 0.014 0.031 0.014 0.021 0.032 0.009 0.000 0.025 0.041 0.054 0.008 0.026 0.013 0.012 0.000 0.159 0.028 0.014 0.007 0.038 0.000 0.052 0.000 0.251 0.055 0.000 0.035 0.012
diabetes_hemoglobin 0.143 0.038 0.079 0.075 0.036 0.042 0.685 -0.027 -0.037 1.000 0.021 -0.017 0.028 -0.012 0.001 0.000 0.000 0.007 0.000 0.000 0.000 0.000 0.000 0.020 0.022 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
stroke_count 0.043 -0.000 0.005 0.008 -0.018 0.005 0.027 -0.003 -0.009 0.021 1.000 -0.004 0.009 0.002 0.000 0.000 0.001 0.000 0.012 0.000 0.000 0.000 0.023 0.000 0.000 0.033 0.014 0.000 0.069 0.008 0.000 0.041 0.036 0.005 0.017 0.000 0.219 0.007 0.032 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.005 0.000 0.044 0.041 0.000 0.058 0.000 0.000 0.005 0.000 0.000 0.000
dui_count -0.043 0.028 -0.008 -0.028 -0.031 -0.028 -0.020 0.042 0.055 -0.017 -0.004 1.000 0.005 0.024 0.017 0.012 0.015 0.008 0.002 0.016 0.000 0.013 0.032 0.004 0.000 0.025 0.000 0.000 0.000 0.008 0.000 0.060 0.014 0.012 0.008 0.000 0.000 0.017 0.009 0.112 0.016 0.133 0.000 0.000 0.000 0.046 0.000 0.000 0.003 0.029 0.000 0.042 0.000 0.038 0.008 0.000 0.012 0.000
weight_loss_amount -0.049 -0.012 0.127 0.167 -0.062 -0.009 0.061 -0.007 0.062 0.028 0.009 0.005 1.000 0.036 0.002 0.008 0.000 0.004 0.000 0.006 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.006 0.000 0.000 0.018 0.006 0.111 0.000 0.000 0.036 0.000 0.000 0.000 0.000 0.000 0.002 0.000 0.012 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.018 0.003 0.000 0.000
skydive_count -0.079 0.001 -0.030 -0.036 -0.038 -0.018 -0.014 0.017 0.079 -0.012 0.002 0.024 0.036 1.000 0.003 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.029 0.017 0.021 0.000 0.000 0.000 0.000 0.000 0.000 0.204 0.114 0.009 0.000 0.000 0.000 0.036 0.032 0.000 0.000 0.000 0.000 0.020 0.000 0.007 0.000 0.000 0.044 0.000 0.000 0.074 0.000 0.009 0.006 0.000 0.000 0.000
gender 0.058 0.706 0.413 0.146 0.071 0.003 0.070 0.108 0.024 0.001 0.000 0.017 0.002 0.003 1.000 0.038 0.078 0.056 0.028 0.036 0.005 0.047 0.001 0.020 0.027 0.013 0.015 0.077 0.016 0.220 0.005 0.011 0.014 0.042 0.022 0.000 0.010 0.010 0.007 0.083 0.007 0.097 0.008 0.034 0.008 0.009 0.018 0.008 0.026 0.015 0.000 0.010 0.004 0.015 0.047 0.031 0.050 0.048
state 0.018 0.018 0.023 0.026 0.014 0.000 0.012 0.018 0.030 0.000 0.000 0.012 0.008 0.000 0.038 1.000 0.066 0.040 0.072 0.115 0.066 0.035 0.034 0.025 0.000 0.007 0.007 0.003 0.000 0.079 0.044 0.009 0.000 0.029 0.013 0.013 0.000 0.020 0.036 0.028 0.013 0.019 0.025 0.108 0.024 0.064 0.015 0.017 0.016 0.016 0.010 0.008 0.026 0.115 0.035 0.000 0.106 0.039
approved_risk_class 0.064 0.027 0.069 0.143 0.027 0.000 0.089 0.027 0.084 0.000 0.001 0.015 0.000 0.000 0.078 0.066 1.000 0.071 0.046 0.173 0.097 0.089 0.087 0.126 0.064 0.025 0.020 0.043 0.030 0.137 0.021 0.046 0.029 0.081 0.027 0.009 0.023 0.058 0.014 0.116 0.022 0.118 0.008 0.039 0.006 0.162 0.049 0.018 0.032 0.019 0.021 0.025 0.000 0.317 0.086 0.056 0.059 0.052
current_ins 0.182 0.054 0.048 0.041 0.094 0.030 0.046 0.030 0.050 0.007 0.000 0.008 0.004 0.000 0.056 0.040 0.071 1.000 0.318 0.063 0.028 0.018 0.002 0.022 0.029 0.000 0.009 0.007 0.000 0.127 0.006 0.010 0.010 0.061 0.000 0.000 0.005 0.016 0.009 0.079 0.000 0.047 0.013 0.048 0.000 0.053 0.006 0.000 0.010 0.008 0.005 0.000 0.000 0.076 0.006 0.031 0.022 0.019
replacement_ins 0.088 0.027 0.012 0.020 0.026 0.018 0.017 0.019 0.021 0.000 0.012 0.002 0.000 0.000 0.028 0.072 0.046 0.318 1.000 0.026 0.011 0.008 0.002 0.006 0.017 0.000 0.000 0.003 0.000 0.036 0.007 0.000 0.014 0.028 0.000 0.000 0.000 0.005 0.000 0.033 0.000 0.000 0.004 0.020 0.000 0.021 0.000 0.000 0.008 0.000 0.000 0.000 0.000 0.036 0.006 0.000 0.014 0.010
marijuana 0.142 0.024 0.035 0.066 0.029 0.000 0.031 0.081 0.612 0.000 0.000 0.016 0.006 0.000 0.036 0.115 0.173 0.063 0.026 1.000 0.007 0.031 0.042 0.018 0.031 0.046 0.006 0.002 0.002 0.108 0.011 0.050 0.027 0.052 0.040 0.004 0.006 0.033 0.039 0.185 0.002 0.048 0.011 0.030 0.005 0.194 0.039 0.013 0.005 0.036 0.000 0.041 0.002 0.277 0.061 0.010 0.047 0.028
valid_drivers_license_app_state 0.087 0.014 0.019 0.018 0.011 0.000 0.021 0.011 0.014 0.000 0.000 0.000 0.000 0.000 0.005 0.066 0.097 0.028 0.011 0.007 1.000 0.010 0.009 0.011 0.009 0.000 0.000 0.003 0.001 0.050 0.005 0.000 0.000 0.024 0.018 0.000 0.000 0.007 0.000 0.024 0.000 0.017 0.000 0.000 0.000 0.018 0.005 0.000 0.002 0.000 0.000 0.000 0.000 0.023 0.011 0.011 0.029 0.007
hiv_pos 0.013 0.025 0.017 0.015 0.000 0.000 0.000 0.005 0.021 0.000 0.000 0.013 0.000 0.000 0.047 0.035 0.089 0.018 0.008 0.031 0.010 1.000 0.033 0.030 0.033 0.031 0.000 0.000 0.002 0.049 0.008 0.051 0.046 0.018 0.026 0.000 0.000 0.015 0.000 0.025 0.008 0.052 0.001 0.003 0.023 0.038 0.000 0.000 0.009 0.000 0.000 0.013 0.000 0.052 0.004 0.025 0.005 0.002
covid 0.017 0.012 0.000 0.017 0.007 0.026 0.033 0.025 0.041 0.000 0.023 0.032 0.000 0.029 0.001 0.034 0.087 0.002 0.002 0.042 0.009 0.033 1.000 0.034 0.041 0.095 0.007 0.011 0.012 0.059 0.008 0.136 0.108 0.052 0.037 0.000 0.015 0.031 0.023 0.058 0.026 0.081 0.001 0.024 0.000 0.058 0.008 0.000 0.018 0.031 0.000 0.037 0.000 0.074 0.049 0.025 0.001 0.004
previous_declined 0.015 0.015 0.039 0.043 0.000 0.014 0.056 0.018 0.012 0.020 0.000 0.004 0.000 0.017 0.020 0.025 0.126 0.022 0.006 0.018 0.011 0.030 0.034 1.000 0.977 0.053 0.024 0.015 0.006 0.052 0.013 0.049 0.021 0.075 0.025 0.000 0.021 0.027 0.000 0.020 0.007 0.068 0.006 0.013 0.000 0.019 0.008 0.002 0.004 0.001 0.003 0.015 0.000 0.035 0.024 0.032 0.015 0.006
previous_decline_reason0.0110.0000.0130.0160.0000.0510.0260.0090.0100.0220.0000.0000.0000.0210.0270.0000.0640.0290.0170.0310.0090.0330.0410.9771.0000.1000.0000.0190.0000.0150.0210.0240.0220.0380.0270.0000.0000.0330.0070.0560.0160.0080.0170.0120.0180.0460.0070.0650.0430.0090.0000.0610.1050.0460.0280.0410.0160.000
chestpain_diagnosis0.0120.0140.0080.0130.0000.0000.0340.0250.0570.0000.0330.0250.0000.0000.0130.0070.0250.0000.0000.0460.0000.0310.0950.0530.1001.0000.0000.0560.0000.0320.0000.2240.1190.0790.1170.0000.1050.0870.0730.0390.0660.0670.0000.0290.0000.0600.0710.0000.1060.0770.0000.1440.0000.0750.0520.1380.0000.000
diabetes_complications0.0190.0060.0080.0090.0000.0000.0820.0000.0000.0000.0140.0000.0000.0000.0150.0070.0200.0090.0000.0060.0000.0000.0070.0240.0000.0001.0000.0000.0000.0230.0000.0390.0160.0280.0000.0000.1150.0130.0000.0000.0000.0630.0000.0000.0000.0000.0020.0000.0000.0000.0000.0000.0000.0180.0200.0040.0000.000
diabetes_gestational0.0250.0570.0130.0480.0100.0000.0180.0120.0070.0000.0000.0000.0000.0000.0770.0030.0430.0070.0030.0020.0030.0000.0110.0150.0190.0560.0001.0000.0000.0310.0000.0370.0320.0170.0360.0000.0150.0000.0000.0120.0020.0520.0000.0030.0000.0070.0000.0000.0000.0000.0000.0060.0060.0090.0240.0160.0000.000
diabetes_hospitalization0.0250.0090.0100.0150.0000.0000.1700.0000.0000.0000.0690.0000.0000.0000.0160.0000.0300.0000.0000.0020.0010.0020.0120.0060.0000.0000.0000.0001.0000.0330.0000.0610.0440.0300.0000.0000.0260.0010.0000.0000.0000.0300.0000.0000.0000.0000.0070.0000.0000.0000.0000.0000.0000.0140.0140.0000.0000.003
employment_status0.0980.0640.0550.0390.0310.0080.0290.0270.0330.0000.0080.0080.0060.0000.2200.0790.1370.1270.0360.1080.0500.0490.0590.0520.0150.0320.0230.0310.0331.0000.2320.0650.0280.0680.0270.0530.0270.0440.0140.0500.0440.2090.0980.0490.0000.1330.0460.0940.0110.0180.0090.0180.0170.2170.0820.0250.0600.031
family_history0.0100.0000.0000.0010.0000.0190.0090.0110.0140.0000.0000.0000.0000.0000.0050.0440.0210.0060.0070.0110.0050.0080.0080.0130.0210.0000.0000.0000.0000.2321.0000.0000.0000.0170.0000.0380.0000.0020.0000.0160.0000.0000.0530.0020.0000.0130.0260.0880.0000.0080.0000.0060.0460.0310.0130.0000.0000.005
final_expense0.0120.0150.0040.0120.0250.0000.0270.0160.0310.0000.0410.0600.0000.2040.0110.0090.0460.0100.0000.0500.0000.0510.1360.0490.0240.2240.0390.0370.0610.0650.0001.0000.2360.0990.0800.0000.0560.0920.0270.0270.0400.2130.0000.0300.0000.0720.0160.0000.1240.0290.0000.1690.0000.0740.0560.1140.0100.000
inpatient0.0110.0060.0060.0080.0070.0910.0100.0050.0140.0000.0360.0140.0180.1140.0140.0000.0290.0100.0140.0270.0000.0460.1080.0210.0220.1190.0160.0320.0440.0280.0000.2361.0000.0390.1140.0000.0220.0510.0300.0170.0500.2430.0000.0210.0400.0390.0000.0000.0700.0520.0000.0740.0000.0540.0390.0730.0100.009
med_advice0.0290.0130.0120.0200.0050.0040.0260.0120.0210.0000.0050.0120.0060.0090.0420.0290.0810.0610.0280.0520.0240.0180.0520.0750.0380.0790.0280.0170.0300.0680.0170.0990.0391.0000.0320.0000.0230.4150.0200.0140.0090.1020.0060.0300.0520.0400.0130.0050.0090.0110.0000.0150.0000.0750.0890.0810.0210.008
seizure_diagnosis0.0050.0000.0000.0050.0170.0000.0130.0250.0320.0000.0170.0080.1110.0000.0220.0130.0270.0000.0000.0400.0180.0260.0370.0250.0270.1170.0000.0360.0000.0270.0000.0800.1140.0321.0000.0000.0000.0430.0370.0520.0190.0230.0000.0000.0000.0550.0320.0000.0620.0270.0000.0830.4470.0590.0260.1410.0000.000
stroke_diagnosis0.0040.0000.0040.0000.0000.0000.0150.0000.0090.0000.0000.0000.0000.0000.0000.0130.0090.0000.0000.0040.0000.0000.0000.0000.0000.0000.0000.0000.0000.0530.0380.0000.0000.0000.0001.0000.0000.0180.0000.0060.0000.1210.0420.0000.0000.0090.0220.0270.0110.0170.0000.0000.0000.0070.0000.0000.0000.000
stroke_diagnosis_multiselect0.0240.0090.0030.0070.0030.0000.0200.0000.0000.0000.2190.0000.0000.0000.0100.0000.0230.0050.0000.0060.0000.0000.0150.0210.0000.1050.1150.0150.0260.0270.0000.0560.0220.0230.0000.0001.0000.0080.0090.0000.0140.0310.0000.0000.0000.0000.0040.0000.0140.0120.0000.0190.0000.0230.0120.0090.0040.000
test_proc_outstanding0.0090.0090.0110.0140.0000.0080.0100.0210.0250.0000.0070.0170.0360.0360.0100.0200.0580.0160.0050.0330.0070.0150.0310.0270.0330.0870.0130.0000.0010.0440.0020.0920.0510.4150.0430.0180.0081.0000.0080.0210.0040.0620.0130.0150.0000.0300.0060.0100.0060.0140.0000.0190.0050.0410.0250.0430.0000.002
climbing_equipment0.0470.0000.0180.0220.0000.0000.0050.0170.0410.0000.0320.0090.0000.0320.0070.0360.0140.0090.0000.0390.0000.0000.0230.0000.0070.0730.0000.0000.0000.0140.0000.0270.0300.0200.0370.0000.0090.0081.0000.0380.0020.0250.0090.0350.0000.0450.0000.0060.0250.1240.0000.0150.0000.0310.0200.0120.0000.000
criminal_history0.0490.0170.0100.0180.0060.0310.0100.0200.0540.0000.0000.1120.0000.0000.0830.0280.1160.0790.0330.1850.0240.0250.0580.0200.0560.0390.0000.0120.0000.0500.0160.0270.0170.0140.0520.0060.0000.0210.0381.0000.0100.0470.0120.0200.0000.3310.0470.0260.0130.0450.0000.0550.0150.2690.0270.0130.0460.018
disability_audio_visual0.0040.0000.0000.0060.0000.0000.0300.0000.0080.0000.0000.0160.0000.0000.0070.0130.0220.0000.0000.0020.0000.0080.0260.0070.0160.0660.0000.0020.0000.0440.0000.0400.0500.0090.0190.0000.0140.0040.0020.0101.0000.1550.0000.0050.0000.0060.0000.0000.0100.0000.0000.0000.0000.0150.0090.0000.0050.000
disability_pmts_reason0.0560.0270.0200.0240.0000.0000.0440.0000.0260.0000.0000.1330.0000.0000.0970.0190.1180.0470.0000.0480.0170.0520.0810.0680.0080.0670.0630.0520.0300.2090.0000.2130.2430.1020.0230.1210.0310.0620.0250.0470.1551.0000.0190.0390.0000.0480.0550.0540.0600.0000.0000.0120.0360.1080.0600.0940.0540.022
expected_travel_90_days0.0100.0000.0050.0070.0100.0000.0000.0000.0130.0000.0000.0000.0000.0000.0080.0250.0080.0130.0040.0110.0000.0010.0010.0060.0170.0000.0000.0000.0000.0980.0530.0000.0000.0060.0000.0420.0000.0130.0090.0120.0000.0191.0000.1870.0000.0040.0210.0280.0190.0170.0000.0270.0000.0150.0060.0000.0180.013
expected_travel0.0320.0180.0120.0300.0460.0000.0120.0530.0120.0000.0000.0000.0020.0200.0340.1080.0390.0480.0200.0300.0000.0030.0240.0130.0120.0290.0000.0030.0000.0490.0020.0300.0210.0300.0000.0000.0000.0150.0350.0200.0050.0390.1871.0000.0000.0000.0000.0000.0280.0300.0000.0370.0000.0120.0240.0110.0670.036
expected_travel_multiselect0.0000.0000.0020.0070.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0080.0240.0060.0000.0000.0050.0000.0230.0000.0000.0180.0000.0000.0000.0000.0000.0000.0000.0400.0520.0000.0000.0000.0000.0000.0000.0000.0000.0000.0001.0000.0000.0000.0000.0000.0000.0000.0000.0000.0050.0030.0000.0130.012
illicit_drugs0.0860.0040.0330.0450.0200.0130.0180.0480.1590.0000.0000.0460.0120.0070.0090.0640.1620.0530.0210.1940.0180.0380.0580.0190.0460.0600.0000.0070.0000.1330.0130.0720.0390.0400.0550.0090.0000.0300.0450.3310.0060.0480.0040.0000.0001.0000.0460.0090.0120.0380.0040.0510.0130.2920.0470.0060.0270.014
mental_health_hospitalized0.0250.0150.0120.0140.0050.0000.0060.0000.0280.0000.0050.0000.0000.0000.0180.0150.0490.0060.0000.0390.0050.0000.0080.0080.0070.0710.0020.0000.0070.0460.0260.0160.0000.0130.0320.0220.0040.0060.0000.0470.0000.0550.0210.0000.0000.0461.0000.0000.0000.0000.0000.0000.0280.0350.0180.0000.0080.005
mental_health_missed_work0.0070.0050.0040.0000.0000.0000.0000.0000.0140.0000.0000.0000.0000.0000.0080.0170.0180.0000.0000.0130.0000.0000.0000.0020.0650.0000.0000.0000.0000.0940.0880.0000.0000.0050.0000.0270.0000.0100.0060.0260.0000.0540.0280.0000.0000.0090.0001.0000.0000.0000.0000.0000.0240.0080.0000.0000.0000.000
pilot_student_private0.0230.0170.0050.0100.0180.0390.0000.0080.0070.0000.0440.0030.0000.0440.0260.0160.0320.0100.0080.0050.0020.0090.0180.0040.0430.1060.0000.0000.0000.0110.0000.1240.0700.0090.0620.0110.0140.0060.0250.0130.0100.0600.0190.0280.0000.0120.0000.0001.0000.0150.0000.1000.0000.0100.0050.0450.0030.000
racing_100mph0.0430.0000.0070.0140.0040.0140.0030.0070.0380.0000.0410.0290.0000.0000.0150.0160.0190.0080.0000.0360.0000.0000.0310.0010.0090.0770.0000.0000.0000.0180.0080.0290.0520.0110.0270.0170.0120.0140.1240.0450.0000.0000.0170.0300.0000.0380.0000.0000.0151.0000.0000.0300.0000.0360.0180.0000.0040.000
rx_increase0.0140.0000.0050.0030.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0100.0210.0050.0000.0000.0000.0000.0000.0030.0000.0000.0000.0000.0000.0090.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0040.0000.0000.0000.0001.0000.0000.0000.0080.0000.0000.0090.000
scuba_130ft0.0430.0030.0090.0190.0070.0430.0030.0370.0520.0000.0580.0420.0000.0740.0100.0080.0250.0000.0000.0410.0000.0130.0370.0150.0610.1440.0000.0060.0000.0180.0060.1690.0740.0150.0830.0000.0190.0190.0150.0550.0000.0120.0270.0370.0000.0510.0000.0000.1000.0300.0001.0000.0000.0420.0270.0570.0000.000
seizure_car_accident0.0070.0000.0000.0040.0000.0000.0000.0000.0000.0000.0000.0000.0000.0000.0040.0260.0000.0000.0000.0020.0000.0000.0000.0000.1050.0000.0000.0060.0000.0170.0460.0000.0000.0000.4470.0000.0000.0050.0000.0150.0000.0360.0000.0000.0000.0130.0280.0240.0000.0000.0000.0001.0000.0000.0000.0000.0000.000
tb0.0970.0000.0290.0580.0340.0070.0260.0920.2510.0000.0000.0380.0000.0090.0150.1150.3170.0760.0360.2770.0230.0520.0740.0350.0460.0750.0180.0090.0140.2170.0310.0740.0540.0750.0590.0070.0230.0410.0310.2690.0150.1080.0150.0120.0050.2920.0350.0080.0100.0360.0080.0420.0001.0000.0770.0190.0440.018
weight_loss0.0520.0190.1340.1730.0190.0040.0630.0020.0550.0000.0050.0080.0180.0060.0470.0350.0860.0060.0060.0610.0110.0040.0490.0240.0280.0520.0200.0240.0140.0820.0130.0560.0390.0890.0260.0000.0120.0250.0200.0270.0090.0600.0060.0240.0030.0470.0180.0000.0050.0180.0000.0270.0000.0771.0000.0210.0330.019
cancer_type0.0240.0040.0020.0040.0080.0000.0060.0080.0000.0000.0000.0000.0030.0000.0310.0000.0560.0310.0000.0100.0110.0250.0250.0320.0410.1380.0040.0160.0000.0250.0000.1140.0730.0810.1410.0000.0090.0430.0120.0130.0000.0940.0000.0110.0000.0060.0000.0000.0450.0000.0000.0570.0000.0190.0211.0000.0100.003
citizen0.0430.0390.0610.0540.0190.0000.0160.0050.0350.0000.0000.0120.0000.0000.0500.1060.0590.0220.0140.0470.0290.0050.0010.0150.0160.0000.0000.0000.0000.0600.0000.0100.0100.0210.0000.0000.0040.0000.0000.0460.0050.0540.0180.0670.0130.0270.0080.0000.0030.0040.0090.0000.0000.0440.0330.0101.0000.518
legal_resident0.0410.0170.0220.0260.0150.0000.0130.0010.0120.0000.0000.0000.0000.0000.0480.0390.0520.0190.0100.0280.0070.0020.0040.0060.0000.0000.0000.0000.0030.0310.0050.0000.0090.0080.0000.0000.0000.0020.0000.0180.0000.0220.0130.0360.0120.0140.0050.0000.0000.0000.0000.0000.0000.0180.0190.0030.5181.000
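
The strongest pairs in the rows above can be picked out with a simple threshold scan. The following is a minimal sketch, not the report's own logic: the four values are copied from the matrix rows, while the `THRESHOLD` cutoff of 0.5 and the helper name `high_correlation_pairs` are assumptions made for illustration.

```python
# Pairwise values copied from the matrix rows above.
pairs = {
    ("previous_declined", "previous_decline_reason"): 0.977,
    ("citizen", "legal_resident"): 0.518,
    ("seizure_diagnosis", "seizure_car_accident"): 0.447,
    ("current_ins", "replacement_ins"): 0.318,
}

# Assumed cutoff; the report does not state the threshold it uses.
THRESHOLD = 0.5


def high_correlation_pairs(values, threshold):
    """Return (pair, value) items whose absolute value meets the threshold, strongest first."""
    flagged = [(pair, v) for pair, v in values.items() if abs(v) >= threshold]
    return sorted(flagged, key=lambda item: -abs(item[1]))


flagged = high_correlation_pairs(pairs, THRESHOLD)
for (left, right), value in flagged:
    print(f"{left} ~ {right}: {value:.3f}")
```

With this assumed cutoff, only the first two pairs are flagged, which matches the corresponding high-correlation alerts in the overview.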

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
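
The per-column nullity count that such plots summarize can be sketched in plain Python. This is a minimal illustration, not the report's own code: the `rows` sample and the helper name `nullity_by_column` are hypothetical, and missing cells are assumed to be represented as `None`.

```python
from typing import Any


def nullity_by_column(rows: list[dict[str, Any]]) -> dict[str, int]:
    """Count missing (None) cells per column across a list of row dicts."""
    columns: list[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    # A key that is absent from a row is treated as missing as well.
    return {name: sum(1 for row in rows if row.get(name) is None) for name in columns}


# Hypothetical sample; the profiled dataset itself reports 0 missing cells.
rows = [
    {"age": 48.5, "gender": "male", "state": "TX"},
    {"age": None, "gender": "female", "state": "OK"},
    {"age": 38.0, "gender": None, "state": None},
]
counts = nullity_by_column(rows)
print(counts)
```

With pandas, `df.isna().sum()` yields the same per-column counts for a DataFrame `df`.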

Sample

agegenderheightweightstateapproved_risk_classbmi_app_stateincome_app_statecurrent_inscurrent_ins_valuereplacement_insdiabetes_agealcohol_weeklymarijuana_monthly_countmarijuanaoccupation_descriptionrisky_activitiesvalid_drivers_license_app_statehiv_poscovidprevious_declinedprevious_decline_reasonchestpain_diagnosisdiabetes_complicationsdiabetes_gestationaldiabetes_hemoglobindiabetes_hospitalizationemployment_statusfamily_historyfinal_expenseinpatientmed_advicemed_conditionsseizure_diagnosisstroke_countstroke_diagnosisstroke_diagnosis_multiselecttest_proc_outstandingtest_proc_typeclimbing_equipmentcriminal_historydisability_audio_visualdisability_pmts_reasondui_countexpected_travel_90_daysexpected_travelexpected_travel_multiselectillicit_drugsmental_health_diagnosismental_health_hospitalizedmental_health_missed_workpilot_student_privateracing_100mphrx_increasescuba_130ftseizure_car_accidenttbweight_lossweight_loss_amountweight_loss_reasoncancer_typecitizenlegal_residentskydive_countsurgery_typetravel_countries
048.534878male68.0195.0TXdeclined29.64641036400no0.0no-1.00.00.0FalseBuy and sell onlinenonetruefalsefalsefalsenonenonenoneFalse-1.0noparttimenononenonenonenonenone0.0nonenoneFalsenoneTruefelonyfalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
142.716825male67.0200.0NMdeclined31.3210079528no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nounemployednononenonenonenonenone0.0nonenoneFalsenoneTruemisdemeanor, felonyfalsessdi0.0nonenononeTruenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
254.399474male66.0193.0KSdeclined31.1476129396no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nostay_at_homenononenonetest_procedureliver_cirrhosisnone0.0nonenoneFalsenoneTruenonefalsessdi0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsetrueFalse0.0nonenoneyesresident0.0nonenone
350.336420female63.0140.0OKdeclined24.79717882000no0.0no-1.00.020.0TrueothernonetruefalsefalsefalsenonenonenoneFalse-1.0nootheryesnonenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsedepression, anxiety, addfalsefalseFalseTrueFalsefalsefalsefalseTrue20.0medicationnoneyesresident0.0nonenone
442.725039female61.0132.0INdeclined24.93845772000no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltime_studentnononenonetest_proceduredepressionnone0.0nonenoneTruemammogram, ekgTruenonefalsessdi0.0nonenononeFalsebipolarfalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
548.868902female64.0129.0INdeclined22.140381576000no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltime_studentnononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
629.207992female68.0165.0SCdeclined25.08542410400no0.0no-1.00.00.0FalseBabysitternonetruefalsefalsefalsenonenonenoneFalse-1.0noparttimenononenonenonesickle_cellnone0.0nonenoneFalsenoneTruenonefalseother0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
757.422124male75.0235.0FLdeclined29.369778100000no0.0no-1.06.00.0FalseRestaurant ownernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenoextended_travelFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenonenonone0.0nonenone
838.029528female61.0110.0MAelite_nt20.78204867000no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0noothernononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
938.065121male69.0186.0CAdeclined27.46439885200no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0noothernononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0yesyesnoneFalseanxietyfalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
agegenderheightweightstateapproved_risk_classbmi_app_stateincome_app_statecurrent_inscurrent_ins_valuereplacement_insdiabetes_agealcohol_weeklymarijuana_monthly_countmarijuanaoccupation_descriptionrisky_activitiesvalid_drivers_license_app_statehiv_poscovidprevious_declinedprevious_decline_reasonchestpain_diagnosisdiabetes_complicationsdiabetes_gestationaldiabetes_hemoglobindiabetes_hospitalizationemployment_statusfamily_historyfinal_expenseinpatientmed_advicemed_conditionsseizure_diagnosisstroke_countstroke_diagnosisstroke_diagnosis_multiselecttest_proc_outstandingtest_proc_typeclimbing_equipmentcriminal_historydisability_audio_visualdisability_pmts_reasondui_countexpected_travel_90_daysexpected_travelexpected_travel_multiselectillicit_drugsmental_health_diagnosismental_health_hospitalizedmental_health_missed_workpilot_student_privateracing_100mphrx_increasescuba_130ftseizure_car_accidenttbweight_lossweight_loss_amountweight_loss_reasoncancer_typecitizenlegal_residentskydive_countsurgery_typetravel_countries
17354751.004470female66.0154.0PAelite_nt24.85353598800yes400000.0yes-1.00.00.0FalsepharmacistnonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0noneyesnoneFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0noneAW
17354838.122617female65.0207.0PAessential_nt34.44284014958no0.0no-1.00.03.0TrueFoodservicenonefalsefalsefalsefalsenonenonenoneFalse-1.0noparttimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
17354944.307549male72.0190.0PApreferred_nt25.765818227000yes300000.0no-1.01.00.0FalseSecurity systems installer - business owner and operatornonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
17355037.156136male70.0185.0PAelite_nt26.541837117000no0.0no-1.01.00.0FalseEnvironmental ConsultingnonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
17355140.364963female63.0130.0PAessential_nt23.0259510no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nostay_at_homenononenonetest_procedurenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
17355235.020569female61.0150.0MEdeclined28.33915670000no0.0no-1.00.02.0Truenurse / operations managernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseTrue50.0surgerynoneyesresident0.0nonenone
17355341.312279female69.0198.0MEdeclined29.236295169000no0.0no-1.01.00.0FalsePharmacistnonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonesurgerynonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0othernone
17355440.214378male72.0196.0PAdeclined26.57947511772no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nounemployednononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
17355532.887739male75.0200.0PAdeclined24.995556369200no0.0no-1.00.00.0FalseResearch EngineernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruemisdemeanorfalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone
17355637.306721male70.0160.0PAdeclined22.95510236400no0.0no-1.00.05.0TrueConstructionnonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonesurgerynonenone0.0nonenoneFalsenoneTruemisdemeanorfalsenone0.0nonenononeTruenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0othernone

Duplicate rows

Most frequently occurring

agegenderheightweightstateapproved_risk_classbmi_app_stateincome_app_statecurrent_inscurrent_ins_valuereplacement_insdiabetes_agealcohol_weeklymarijuana_monthly_countmarijuanaoccupation_descriptionrisky_activitiesvalid_drivers_license_app_statehiv_poscovidprevious_declinedprevious_decline_reasonchestpain_diagnosisdiabetes_complicationsdiabetes_gestationaldiabetes_hemoglobindiabetes_hospitalizationemployment_statusfamily_historyfinal_expenseinpatientmed_advicemed_conditionsseizure_diagnosisstroke_countstroke_diagnosisstroke_diagnosis_multiselecttest_proc_outstandingtest_proc_typeclimbing_equipmentcriminal_historydisability_audio_visualdisability_pmts_reasondui_countexpected_travel_90_daysexpected_travelexpected_travel_multiselectillicit_drugsmental_health_diagnosismental_health_hospitalizedmental_health_missed_workpilot_student_privateracing_100mphrx_increasescuba_130ftseizure_car_accidenttbweight_lossweight_loss_amountweight_loss_reasoncancer_typecitizenlegal_residentskydive_countsurgery_typetravel_countries# duplicates
1541.194549male72.0190.0TXelite_nt25.765818165696yes750000.0no-1.00.00.0FalseEngineernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenonenovisa0.0nonenone3
018.861441female64.0130.0MSessential_nt22.31201213000no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltime_studentnononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
122.382390female63.0110.0GAselect_nt19.4834970no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0noparttime_studentnononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
228.723382male65.0145.0CAelite_nt24.12662739000no0.0no-1.01.00.0FalseStockernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenonenoresident0.0nonenone2
328.824685female64.0160.0WVselect_nt27.4609383744no0.0no-1.00.00.0FalseothernonetruefalsefalsefalsenonenonenoneFalse-1.0noothernononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
430.338748female60.0130.0CAessential_nt25.386111208000no0.0no-1.00.00.0FalseIndependent contractornonefalsefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
530.880853male74.0205.0CApreferred_nt26.317568150000yes450000.0no-1.01.00.0FalseFinancial AdvisornonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonetest_procedurenonenone0.0nonenoneTruecolonoscopyTruemisdemeanorfalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
631.091672female70.0270.0PAessential_nt38.73673552000no0.0no-1.00.00.0FalseCosmetologistnonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
731.915782female62.0160.0NCpreferred_nt29.26118667600no0.0no-1.03.00.0FalseHairstylistnonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
835.381972male71.0165.0TXselect_nt23.010315132000no0.0no-1.04.00.0FalseSoftware developernonetruefalsefalsefalsenonenonenoneFalse-1.0nofulltimenononenonenonenonenone0.0nonenoneFalsenoneTruenonefalsenone0.0nonenononeFalsenonefalsefalseFalseTrueFalsefalsefalsefalseFalse0.0nonenoneyesresident0.0nonenone2
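
A table like the one above, with each repeated row and its `# duplicates` count, can be approximated with a frequency count over whole rows. This is a minimal sketch under stated assumptions: the three-column `rows` sample and the helper name `duplicate_row_counts` are hypothetical, and the count is taken to mean the total number of occurrences of a row.

```python
from collections import Counter


def duplicate_row_counts(rows: list[tuple]) -> list[tuple[tuple, int]]:
    """Return rows that occur more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    # Sort by descending count; ties keep first-seen order because sort is stable.
    repeated.sort(key=lambda item: -item[1])
    return repeated


# Hypothetical rows reduced to three columns (gender, state, risk class).
rows = [
    ("male", "TX", "elite_nt"),
    ("female", "MS", "essential_nt"),
    ("male", "TX", "elite_nt"),
    ("female", "GA", "select_nt"),
    ("male", "TX", "elite_nt"),
    ("female", "MS", "essential_nt"),
]
result = duplicate_row_counts(rows)
print(result)
```

In pandas, `df.duplicated()` flags the repeated rows of a DataFrame `df`, and `df.drop_duplicates()` removes them.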